
CIF, COMCIFS, and other buzz words

[Brown] I. D. Brown, 'No pictures please.'

It is two years since Acta Crystallographica C started accepting structural papers in electronic form and the Crystallographic Information File (CIF), an evolving protocol for data interchange, was adopted. A CIF uses the STAR file structure (S. R. Hall, J. Chem. Inf. Comput. Sci. 31 (1991) 326-333), in which each item of information, whether a number or a piece of text, is preceded by a 'data name' that defines what kind of information follows. For example, the title of the paper is preceded by the name '_publ_section_title', and the Hermann-Mauguin space group symbol by the name '_symmetry_space_group_name_H-M'. To use the file, one needs a dictionary that lists the names of all the allowed data fields together with a precise definition of the information that each field contains.
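As a rough illustration of the 'data name, then value' layout, the following Python sketch reads a few single-line items and looks their names up in a dictionary. Only the two data names quoted above come from the CIF dictionary; the sample values, the name '_example_made_up_name', the function names, and the two-entry dictionary are assumptions made for this example. Real CIFs also contain loops and multi-line text fields, which this sketch ignores.

```python
# Tiny excerpt standing in for the CIF dictionary (assumption: the real
# dictionary is far larger and carries much more detailed definitions).
DICTIONARY = {
    "_publ_section_title": "Title of the paper",
    "_symmetry_space_group_name_H-M": "Hermann-Mauguin space group symbol",
}

# Hypothetical file contents; the values are invented for illustration.
SAMPLE = """\
data_example
_publ_section_title 'A hypothetical structure report'
_symmetry_space_group_name_H-M 'P 21/c'
_example_made_up_name 42
"""


def parse_simple_cif(text):
    """Collect single-line 'data name, value' pairs from CIF-like text."""
    items = {}
    for line in text.splitlines():
        parts = line.strip().split(None, 1)
        # Keep only lines of the form: _data_name value
        if len(parts) != 2 or not parts[0].startswith("_"):
            continue
        name, value = parts
        # Remove matching quotes around a text value.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        items[name] = value
    return items


def undefined_names(items, dictionary):
    """Return the data names that the dictionary does not define."""
    return [name for name in items if name not in dictionary]


items = parse_simple_cif(SAMPLE)
for name, value in items.items():
    meaning = DICTIONARY.get(name, "not defined in this dictionary")
    print(f"{name}: {value!r} ({meaning})")
```

The point of the sketch is that the file alone gives names and values; it is the dictionary that says what each value means.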

The CIF that contains a structure determination submitted to Acta Crystallographica is used for a number of purposes. Firstly, the editorial staff use the CIF as input to a checking program to ensure that the numbers are self-consistent and have reasonable values. After the paper is accepted, the CIF is again used as input to the program that typesets the text and automatically creates the tables of atomic coordinates, bond lengths, and angles. Finally, the CIF is sent to the appropriate crystal structure data base so that the numerical results can be made available on-line.
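To give a flavour of what a self-consistency check might look like, the sketch below recomputes the unit-cell volume from the cell lengths and angles and compares it with the reported volume. This is not the journal's checking program; the choice of check, the function names, the tolerance, and the sample values are all assumptions made for illustration. The sketch assumes the cell items are stored under the data names '_cell_length_a', '_cell_angle_alpha', '_cell_volume', and so on.

```python
import math
import re


def strip_su(value):
    """Drop a trailing standard uncertainty such as '(2)' and return a float."""
    return float(re.sub(r"\(\d+\)$", "", value))


def cell_volume(a, b, c, alpha, beta, gamma):
    """Volume of a general (triclinic) cell; angles are in degrees."""
    ca, cb, cg = (math.cos(math.radians(x)) for x in (alpha, beta, gamma))
    return a * b * c * math.sqrt(1 - ca * ca - cb * cb - cg * cg + 2 * ca * cb * cg)


def volume_is_consistent(items, tolerance=0.01):
    """Compare the reported volume with the one implied by the cell parameters.

    The relative tolerance is an arbitrary illustrative choice.
    """
    lengths = [strip_su(items[f"_cell_length_{x}"]) for x in "abc"]
    angles = [strip_su(items[f"_cell_angle_{x}"]) for x in ("alpha", "beta", "gamma")]
    reported = strip_su(items["_cell_volume"])
    computed = cell_volume(*lengths, *angles)
    return abs(computed - reported) / reported <= tolerance


# Invented values for a hypothetical orthorhombic cell.
cell = {
    "_cell_length_a": "10.000(2)",
    "_cell_length_b": "12.000(2)",
    "_cell_length_c": "15.000(3)",
    "_cell_angle_alpha": "90",
    "_cell_angle_beta": "90",
    "_cell_angle_gamma": "90",
    "_cell_volume": "1800.0(6)",
}
print(volume_is_consistent(cell))
```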

Crystallography is not static but changes over time and the CIF dictionary already published in Acta Cryst. A47 (1991) 655-685 needs to evolve with the subject. Unfortunately, it is not possible to remove data names from the dictionary without invalidating files written using older names and definitions. Like the laws of the ancient Medes and Persians, no data name or definition can be altered or deleted once it has been adopted. All we can do is to add new names. To monitor and approve the additions, the International Union of Crystallography (IUCr) has established the Committee for the Maintenance of the CIF Standard (COMCIFS) whose current members are I. D. Brown (chair), B. McMahon (secretary), F. H. Allen, P. M. D. Fitzgerald, S. R. Hall, and B. H. Toby with H. D. Flack and G. M. Sheldrick acting as consultants.
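The 'add only' rule can be pictured as a container that accepts new data names but offers no way to change or delete an existing one. The class below is a minimal sketch of that idea, not a description of how the CIF dictionary is actually maintained; the class name, its methods, and the second data name are hypothetical.

```python
class AppendOnlyDictionary:
    """Holds data-name definitions; entries can be added but never changed."""

    def __init__(self):
        self._definitions = {}

    def add(self, name, definition):
        # An adopted name keeps its definition for good, so a second
        # attempt to define it is refused rather than overwriting it.
        if name in self._definitions:
            raise ValueError(f"{name} is already defined and cannot be altered")
        self._definitions[name] = definition

    def lookup(self, name):
        return self._definitions[name]

    def __contains__(self, name):
        return name in self._definitions

    # Deliberately no method for removing or redefining an entry.


dictionary = AppendOnlyDictionary()
dictionary.add("_publ_section_title", "Title of the paper")
# Hypothetical new name, standing in for an approved extension.
dictionary.add("_example_new_item", "Definition agreed at a later date")

try:
    dictionary.add("_publ_section_title", "A different definition")
except ValueError as error:
    print(error)
```

Files written against the older entries stay valid because nothing they rely on can change underneath them.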

The first version of the dictionary (1991) was designed to cover the needs of small molecule crystallographers. Since then, the International Centre for Diffraction Data (ICDD) has adopted the format for the Powder Data File and the macromolecular community has adopted the CIF structure for the Protein Data Bank and related data bases. This has involved the definition of a large set of new data names. It is easy to provide definitions for well-defined concepts such as the lattice parameters or atomic coordinates, but it is less easy to know how one should describe the restraints that may have been applied during a refinement. Different program packages currently handle restraints in different ways. Once there is consensus on how restraints are used, a CIF dictionary definition can be made, but what do we do in the meantime? How one deals with concepts that are not yet universally adopted is one of the problems the members of COMCIFS are currently wrestling with.

What should you do if you want to propose an extension to the CIF dictionary to cover a new field of crystallography? In the first instance you should contact the chair of COMCIFS who will, if appropriate, establish a working party to prepare an initial set of definitions. These will then be checked by COMCIFS to ensure that they conform to the STAR and CIF standards and do not duplicate existing names. COMCIFS will then publish a draft dictionary extension and ask people in the field for comment. When all the comments have been taken into account, COMCIFS will approve the extension, at which time the definitions become standard and unalterable.

COMCIFS held its first meeting at the IUCr Congress in Beijing last summer. Since then it has been in regular session by e-mail.

If you have any suggestions or comments on the Crystallographic Information File, please contact me (idbrown@mcmaster.ca), the secretary (bm@iucr.ac.uk), or any of the members.

I.D. Brown
Chair, COMCIFS