The Crystallographic Information File (CIF) is the standard file format for the exchange of crystallographic information published by the International Union of Crystallography (IUCr) in 1991.
A CIF is an ASCII file conforming to the STAR File syntax with some additional restrictions. It contains tags identifying many quantities or elements of information, and the associated values. The meanings of the tags are specified in standard library files known as CIF dictionaries, although files may also contain pieces of information defined implicitly or in non-standard dictionaries.
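As a rough illustration of the tag-and-value layout described above, the following Python sketch collects simple single-line items from CIF-like text. It is a minimal sketch and not a conforming CIF parser: it ignores loops, multi-line text fields, and most quoting rules. The function name parse_simple_items and the sample values are assumptions made for illustration only.

```python
def parse_simple_items(text):
    """Collect single-line 'tag value' pairs from CIF-like text.

    Hypothetical helper: loops, semicolon-delimited text fields and
    multiple data blocks are deliberately not handled.
    """
    items = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        if not line.startswith("_"):
            continue  # skip data_ headers, comments and blank lines
        parts = line.split(None, 1)  # tag, then the rest of the line
        if len(parts) != 2:
            continue  # value is on a later line; out of scope here
        tag, value = parts[0], parts[1].strip()
        # Strip one pair of matching surrounding quotes, if present.
        if len(value) >= 2 and value[0] == value[-1] and value[0] in "'\"":
            value = value[1:-1]
        items[tag] = value
    return items


# Illustrative input; the numeric value is made up for the example.
example = """\
data_example
_cell_length_a           5.64
_chemical_name_common    'sodium chloride'
"""

items = parse_simple_items(example)
for tag, value in items.items():
    print(tag, "=", value)
```

The meaning of each tag is not carried by the file itself; a program would look the tag up in the relevant CIF dictionary.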
The specification of the CIF standard can be found on the web pages at http://www.iucr.org/iucr-top/cif.
Several standard dictionaries of data names are maintained for use in different branches of crystallography or related structural sciences. The various dictionaries and the project teams responsible for their specification and maintenance are identified by acronyms such as
Information on these and other specialised dictionaries may be found on the CIF home page.
Reference:
Hall, S. R., Allen, F. H. and Brown, I. D. (1991). Acta Cryst. A47, 655-685.
Nothing is necessarily "wrong" with any of the file formats listed, but they typically serve the purposes of a specific program, are in some cases undocumented or unstable over time, and are unstructured. CIF, by contrast, is specifically designed for data exchange between software packages of many different types. It is publicly documented and intended to be very stable, and it describes in detail the data model appropriate for the data it conveys.
But most importantly, it provides a framework for all data of crystallographic interest: raw intensity sets, reduced data sets, images, graphical representations, tables of derived data, experimental logs, or literature submissions. It has a very rich vocabulary of recognised data names drawn up by experts in crystallography. While it may be useful to transform some or all of the contents of a CIF to some or all of the above formats, the CIF remains the single most comprehensive repository of information suitable for archiving or data interchange.
STAR is an acronym for Self-defining Text Archive and Retrieval. It refers to a file format published in 1991 and designed to satisfy the following goals: the data structure is completely self-defined; the data items are completely self-defined; the data syntax rules are few and simple; the data may be of any type and in any order; the file is easy to read visually, or by machine.
References:
Hall, S. R. (1991). J. Chem. Inf. Comput. Sci. 31, 326-333.
Hall, S. R. and Spadaccini, N. (1994). J. Chem. Inf. Comput. Sci. 34, 505-508.
A CIF is a STAR File that obeys additional syntax rules and conventions on data typing, and that uses data names of crystallographic relevance defined in external library files known as data dictionaries.
STAR Files (with or without the syntax restrictions of CIF) are used in other disciplines. Examples are the NMR-STAR datasets of the BioMagResBank, the World Database of Crystallographers source files, and the Molecular Information File (MIF). Clearly, the less another STAR application departs from the restricted CIF syntax, the easier it will be to handle overlap between CIF and non-CIF applications.
Yes. A formal Backus-Naur specification of the STAR syntax was published in J. Chem. Inf. Comput. Sci. (1994), 34, p. 507.
DDL is the Dictionary Definition Language, a formal mechanism for defining data types and data dependencies in CIF or in certain other STAR File applications.
Here is a simple example. In the CIF Core dictionary, the data tag _chemical_melting_point that identifies the melting point of a chemical compound is defined in the following way:
data_chemical_melting_point
    _name                   '_chemical_melting_point'
    _category               chemical
    _type                   numb
    _enumeration_range      0.0:
    _units                  K
    _units_detail           'kelvins'
    _definition
;   The temperature in kelvins at which a crystalline solid changes to a
    liquid.
;

The dictionary is itself a STAR File; the identifiers _name, _type etc. comprise the DDL. From this example, it can be seen that the purpose of many of the terms in DDL can be readily deduced; however, the formalism permits machine parsing of dictionaries and consequent automatic validation of CIFs (or other STAR File applications) containing the data names defined in the dictionaries.
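To make the idea of automatic validation concrete, here is a minimal Python sketch that checks a value against the two machine-readable constraints in the definition above: _type numb and _enumeration_range 0.0:. The function names are hypothetical, and the sketch assumes a plain number; it does not handle CIF values carrying a standard uncertainty in parentheses, nor the special values ? and . that a real validator would need to accept.

```python
def parse_range(spec):
    """Split a 'low:high' range; either bound may be empty (open-ended)."""
    low, _, high = spec.partition(":")
    return (float(low) if low else None, float(high) if high else None)


def check_numb(value, range_spec):
    """Return True if value is a number lying inside range_spec.

    Hypothetical helper illustrating dictionary-driven validation.
    """
    try:
        number = float(value)
    except ValueError:
        return False  # fails the '_type numb' constraint
    low, high = parse_range(range_spec)
    if low is not None and number < low:
        return False
    if high is not None and number > high:
        return False
    return True


# Range taken from the _chemical_melting_point definition: 0.0 and upwards.
MELTING_POINT_RANGE = "0.0:"

print(check_numb("1074", MELTING_POINT_RANGE))
print(check_numb("-5", MELTING_POINT_RANGE))
print(check_numb("hot", MELTING_POINT_RANGE))
```

A real validator would read the type and range from the parsed dictionary rather than hard-coding them, which is precisely what the DDL formalism makes possible.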
Reference:
McMahon, B. (1995). A Brief History of the DDL.
http://www.iucr.org/iucr-top/cif/ddlhist.html.
DDL0 is the name sometimes given to the set of properties associated with CIF data names in the original CIF dictionary published in 1991. These have been superseded by more refined formalisms, and it is not in general considered necessary that DDL parsers should recognise these properties.
DDL version 1 is the set of properties associated with CIF data names in the coreCIF dictionary and in other CIF extension dictionaries, largely concerned with small-molecule crystallography.
References:
Hall, S. R. and Cook, A. P. F. (1995). J. Chem. Inf. Comput. Sci. 35, 819-825.
DDL Version 1.4 Dictionary (1995).
ftp://ftp.iucr.org/pub/ddldic.c95.
DDL version 2 is the set of properties associated with CIF data names in the mmCIF dictionary for macromolecular structures, and in related projects such as the imgCIF work on images. It embraces a data model that is very close to that of a relational database. CIF data names described by DDL2 include a period character `.' to indicate the category (or, equivalently, relational table) to which they have been assigned.
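The category convention can be sketched in a few lines of Python. The helper below is hypothetical and only illustrates the naming rule described above: it splits a DDL2-style data name at the first period into its category and item parts. It makes no attempt to check that the name is actually defined in any dictionary.

```python
def split_ddl2_name(name):
    """Split a DDL2-style data name into (category, item).

    Hypothetical helper: raises ValueError for names that lack the
    leading underscore or the period separating category from item.
    """
    if not name.startswith("_") or "." not in name:
        raise ValueError(f"not a DDL2-style data name: {name!r}")
    category, item = name[1:].split(".", 1)
    return category, item


# A DDL2-style name with a period, as used in mmCIF.
print(split_ddl2_name("_cell.length_a"))
```

In relational terms, the category plays the role of the table and the item that of a column within it.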
Reference:
DDL Version 2.1.1 Dictionary (1995).
ftp://ftp.iucr.org/pub/ddl2.c96.
DDL version 3 is the name given to some work in progress by Syd Hall and his colleagues at the University of Western Australia. It is intended to build on the greater consistency and data typing abilities of DDL2 without tying the data model too closely to that of a relational database. A particular goal of DDL3 is to introduce methods into data dictionaries through a formal language known as dREL (Dictionary Relational Expression Language).
CBF is the Crystallographic Binary File. It is an outgrowth from the imgCIF project to describe area-detector and image-plate images in CIF terms. While the imgCIF dictionary allows a fully compliant CIF archive file to be created of an image, it was felt that the ASCII representation of huge data sets was unnecessarily verbose, and a packed binary representation was designed to be of more use in everyday working. The information content of a CBF is identical to that of its associated imgCIF ASCII representation: only the internal data storage (and file size) differ.
No. There are no IUCr Police.
The IUCr will vigorously protect CIF and STAR as standards for our community. In most cases, we will try to talk to you and get you back on the right track, but, if we must resort to legal action to protect the interests of the community, we will do what we must.
We want you and your colleagues to be confident that when you receive a file represented to you as a CIF or a STAR file, it is indeed a CIF or a STAR file. So, if somebody starts distributing files which are called CIF or STAR, but which don't comply with the rules for making CIF or STAR files, and we know about it, we will ask them to stop doing that.
We want you and your colleagues to be confident that when you work with software that claims to read or write CIF or STAR files, it does indeed do that. So if somebody produces what they claim to be a CIF or STAR application, we want to be able to take a look at the program and documentation and assure ourselves that it is. If the program is in the public domain, we can do that. If somebody wants to do something more restrictive, things can be worked out, but we have to be able to assure ourselves and you that what claims to be a CIF or STAR application is indeed working with CIF or STAR.
We will not permit any other organization to "capture" STAR or CIF and try to ransom it back to the community.
There is more that we can and will do, but these are the most important parts.
We will protect CIF and STAR as trademarks and service marks of the IUCr. We will protect the IUCr STAR patent. We will protect the copyright on the various "public" CIF dictionaries and on the DDL's. We will protect CIF and STAR as effective standards for the community.
Hypothetically speaking, we suppose it is possible that in some arrangement for some commercial product involving CIF or STAR, there might be grounds for the IUCr to collect a license fee or royalty for something, so we don't want to totally rule out the possibility. In most cases, however, all we are trying to do is to get CIF and STAR to be widely and freely used so that it is easier for our community to exchange data reliably. Certainly, if you are putting out your CIF or STAR compliant application to the world for free, we are not going to ask you to start charging money for it so that you can pay the IUCr a license fee.
Copyright © International Union of Crystallography