Discussion List Archives

Unifying chemistry and crystallography

We (Simon ("Billy") Tyrrell and I) have been working on interpreting 
crystallography in terms of chemistry. With the help and encouragement of 
the IUCr Chester office (esp. Peter and Brian) we have taken the CIFs from 
an issue of ActaE and converted them to CML. These are early findings but 
they look good. The following comments are provisional and please forgive 
us if they are standard knowledge and sound naive.

Almost all (> 98%) of the ca. 200 CIFs convert well into chemically 
meaningful structures. We have computed the formula_sum from the 
_atom_site_occupancy corrected for multiplicity (from the symmetry 
operators) and find this agrees almost universally with the 
_chemical_formula_sum. We have computed the molecular mass and this also 
agrees almost completely: a few large compounds differ very slightly, which 
suggests that the atomic masses used may differ between our implementation 
and the programs used to generate the CIFs. In particular the agreement 
suggests that all authors account explicitly for all hydrogen atoms in the 
structures.
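
As an illustration of the check described above, here is a minimal Python 
sketch. It is not our actual CML-based code. It assumes each atom site has 
already been read as an (element, occupancy, site multiplicity) triple, that 
the occupancy is the chemical occupancy rather than one already scaled for 
special positions, and that Z is known. The function names and the small 
atomic mass table are hypothetical and for illustration only.

```python
import re
from collections import defaultdict

# Illustrative subset of standard atomic masses (assumption: a real tool
# would use a complete table, and tables differ slightly between programs).
ATOMIC_MASS = {"H": 1.008, "C": 12.011, "N": 14.007, "O": 15.999,
               "Cl": 35.45, "Cu": 63.546}

def formula_sum(atom_sites, z):
    """Sum occupancy * site multiplicity per element, per formula unit."""
    totals = defaultdict(float)
    for element, occupancy, multiplicity in atom_sites:
        totals[element] += occupancy * multiplicity
    return {el: round(n / z, 3) for el, n in totals.items()}

def parse_formula(text):
    """Parse a formula string such as 'C6 H12 O6' into element counts."""
    counts = defaultdict(float)
    for token in text.split():
        m = re.fullmatch(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", token)
        if m is None:
            raise ValueError(f"unrecognised formula token: {token!r}")
        counts[m.group(1)] += float(m.group(2) or 1)
    return dict(counts)

def formulas_agree(computed, reported, tol=0.01):
    """True if every element count matches within the tolerance."""
    elements = set(computed) | set(reported)
    return all(abs(computed.get(el, 0.0) - reported.get(el, 0.0)) <= tol
               for el in elements)

def molecular_mass(counts):
    """Formula mass from element counts and the illustrative mass table."""
    return sum(ATOMIC_MASS[el] * n for el, n in counts.items())

# Made-up example: a metal on a special position (multiplicity 2) and a
# halide on a general position (multiplicity 4), with Z = 2.
sites = [("Cu", 1.0, 2), ("Cl", 1.0, 4)]
computed = formula_sum(sites, z=2)
print(computed, formulas_agree(computed, parse_formula("Cl2 Cu")))
```

A tolerance is used in the comparison because partial occupancies rarely 
sum to exact integers.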

We have then used CML-based software to display the connection tables for 
the entries.  Where a structure is reported as disordered we have 
arbitrarily taken the first (usually the most highly occupied) disorder group. 
At this stage the connection tables are only meaningful for cases where the 
full "molecule" has no crystallographic symmetry elements. For ionic 
structures they are also not yet decorated with charges. The connection 
tables agree well with the reported structure diagram and/or chemical name.

It is now possible to check the formulae against the 
_chemical_formula_moiety - which is the only explicit way of assigning 
charges. Again the agreement seems to be good.
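
A rough Python sketch of such a moiety check follows. It handles only 
simple moiety strings: comma-separated moieties, an optional leading 
multiplier as in "5(H2 O1)", and a trailing charge token such as "1+" or 
"2-". Other forms the field allows are not covered, and the function name 
and example strings are illustrative assumptions, not our implementation.

```python
import re
from collections import defaultdict

def parse_moiety(text):
    """Return (element counts, net charge) for a simple moiety string."""
    counts = defaultdict(float)
    net_charge = 0
    for part in text.split(","):
        part = part.strip()
        # Optional leading multiplier, e.g. '5(H2 O1)'.
        m = re.fullmatch(r"(\d+)\((.+)\)", part)
        multiplier, body = (int(m.group(1)), m.group(2)) if m else (1, part)
        for token in body.split():
            charge = re.fullmatch(r"(\d+)([+-])", token)
            if charge:
                sign = 1 if charge.group(2) == "+" else -1
                net_charge += multiplier * sign * int(charge.group(1))
                continue
            el = re.fullmatch(r"([A-Z][a-z]?)(\d+(?:\.\d+)?)?", token)
            if el is None:
                raise ValueError(f"unrecognised moiety token: {token!r}")
            counts[el.group(1)] += multiplier * float(el.group(2) or 1)
    return dict(counts), net_charge

# Illustrative strings: an ion pair and a pentahydrate.
salt_counts, salt_charge = parse_moiety("C12 H17 N4 O S 1+, C6 H2 N3 O7 1-")
hydrate_counts, hydrate_charge = parse_moiety("C12 H16 N2 O6, 5(H2 O1)")
print(salt_counts, salt_charge)
print(hydrate_counts, hydrate_charge)
```

The summed element counts can then be compared with the formula sum, and a 
non-zero net charge flags an entry that needs a closer look.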

For the molecules which lie on symmetry elements there may be a degree of 
subjectivity as to how the complete molecule is constructed, especially if 
the molecule is polymeric in 1, 2 or 3 directions. One approach is to 
generate symmetry related atoms and see if they join onto the growing 
fragment. We have also used a heuristic approach which is to use the 
author-provided lengths, angles and torsions as defining the chemical 
connectivity. Where the molecule has symmetry a number of bonds and angles 
are repeated together with their symmetry operator. This allows immediate 
identification of the symmetry operations required to generate a larger 
molecule. By adding on the symmetry generated replicas and removing common 
atoms it is normally possible to generate a complete connection table 
compatible with the formula sum or formula moiety and with the name or 
structural diagram. We have not yet automatically checked the results of 
this exercise against the formula_moiety.
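
The heuristic can be sketched in Python as below. The sketch assumes the 
author-provided bonds have been read as (atom label 1, atom label 2, 
symmetry code) triples, with the code taken from an item such as 
_geom_bond_site_symmetry_2 in the usual n_klm form, where n is the 
symmetry operator number and 555 means no lattice translation. The 
function names and the bond list are hypothetical, and the step of 
merging replicas and removing common atoms is not shown.

```python
import re

IDENTITY = (1, (0, 0, 0))

def parse_symmetry_code(code):
    """Turn 'n_klm' into (operator number, (dx, dy, dz)).

    '.' or '?' is treated as the identity; a bare 'n' means no translation.
    """
    if code in (".", "?"):
        return IDENTITY
    m = re.fullmatch(r"(\d+)(?:_(\d)(\d)(\d))?", code)
    if m is None:
        raise ValueError(f"unrecognised symmetry code: {code!r}")
    number = int(m.group(1))
    if m.group(2) is None:
        return number, (0, 0, 0)
    # Each digit is a lattice shift offset by 5, so 6 means +1 and 4 means -1.
    shifts = tuple(int(d) - 5 for d in m.group(2, 3, 4))
    return number, shifts

def required_operations(bonds):
    """Collect the distinct non-identity operations referenced by the bonds."""
    ops = set()
    for _atom1, _atom2, code in bonds:
        op = parse_symmetry_code(code)
        if op != IDENTITY:
            ops.add(op)
    return sorted(ops)

# Made-up bond list: some bonds reach symmetry-generated atoms.
bonds = [
    ("Cu1", "Cl1", "."),
    ("Cu1", "Cl1", "3_565"),
    ("Cu1", "N1", "3_565"),
    ("Cl1", "Cu1", "2_655"),
]
print(required_operations(bonds))
```

Each operation returned identifies one symmetry-generated replica to add 
to the asymmetric unit before atoms common to both copies are removed.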

This implies that the crystallographers (and/or the programs they use 
and/or the Acta office) have good discipline in reporting the chemistry of 
the compounds and that it is reasonable to request enhanced chemical 
information in standard CIFs. It suggests that the major programs used 
already implicitly store chemical connection tables and should be able to 
emit them.

In recent discussions with crystallographers it seems that the preparation 
of the chemistry for publication is a significant fraction of the time 
taken to "do a crystal structure". Labelling atoms and describing symmetry 
is a particular concern.



>coreCIFchem mailing list

Peter Murray-Rust
Unilever Centre for Molecular Informatics
Chemistry Department, Cambridge University
Lensfield Road, CAMBRIDGE, CB2 1EW, UK
Tel: +44-1223-763069
