E0726

A MMCIF TOOLBOX FOR CCP4 APPLICATIONS. Peter A. Keller, CCP4, Daresbury Laboratory, Warrington WA4 4AD, UK

A programmers' toolbox of routines has been written for the manipulation of CIFs. It is designed to integrate with the CCP4 protein crystallography suite[1], and enables existing CCP4 applications (written in Fortran77) to be converted to use appropriate categories of CIF information. New applications that conform to existing CCP4 practice are also easily written.

Input CIFs are parsed according to the STAR syntax rules[2] (implemented using a PCCTS[3] grammar), and an abstract syntax tree (AST) representation of the data is built. Information on the location of each data item is maintained internally for rapid retrieval. CIF data is generated or modified by constructing or manipulating an AST. Data verification is performed against a dictionary conforming to the Macromolecular DDL version 2.1[4]. The dictionary itself is converted to a binary representation that incorporates a hashed lookup on data names.
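The parse-then-index idea can be illustrated with a minimal Python sketch. It is not the toolbox's implementation: the toolbox uses a PCCTS-generated parser for the full STAR syntax, whereas this sketch hand-parses only a small CIF subset (data blocks, single name/value pairs and simple loops; no semicolon text fields, save frames or inline comments). The names `Item`, `Block`, `tokenize` and `parse` are hypothetical. The tree is a list of blocks holding items, and a hash table keyed on (block name, data name) records where each item sits in the tree.

```python
import re
from dataclasses import dataclass, field

# A token is a single-quoted string, a double-quoted string, or a bare word.
TOKEN = re.compile(r"""'([^']*)'|"([^"]*)"|(\S+)""")


@dataclass
class Item:
    name: str
    values: list  # one entry for a plain item, one per row for a looped item


@dataclass
class Block:
    name: str
    items: list = field(default_factory=list)


def tokenize(text):
    """Yield (value, quoted) pairs; lines starting with '#' are skipped."""
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        for m in TOKEN.finditer(line):
            quoted = m.group(3) is None
            value = next(g for g in m.groups() if g is not None)
            yield value, quoted


def is_structural(token):
    """True for unquoted data names and the data_ / loop_ keywords."""
    value, quoted = token
    if quoted:
        return False
    low = value.lower()
    return value.startswith("_") or low.startswith("data_") or low == "loop_"


def parse(text):
    """Return (blocks, index): the tree plus a hashed lookup into it."""
    tokens = list(tokenize(text))
    blocks, index = [], {}

    def add(item):
        if not blocks:
            raise ValueError("data item outside a data block")
        blocks[-1].items.append(item)
        index[(blocks[-1].name, item.name)] = item  # location of the item

    i = 0
    while i < len(tokens):
        value, quoted = tokens[i]
        low = value.lower()
        if not quoted and low.startswith("data_"):
            blocks.append(Block(value[5:]))
            i += 1
        elif not quoted and low == "loop_":
            i += 1
            columns = []
            while (i < len(tokens) and not tokens[i][1]
                   and tokens[i][0].startswith("_")):
                columns.append(Item(tokens[i][0], []))
                i += 1
            if not columns:
                raise ValueError("loop_ without data names")
            n = 0
            while i < len(tokens) and not is_structural(tokens[i]):
                # Values cycle through the loop's columns in order.
                columns[n % len(columns)].values.append(tokens[i][0])
                n += 1
                i += 1
            for column in columns:
                add(column)
        elif not quoted and value.startswith("_"):
            if i + 1 >= len(tokens) or is_structural(tokens[i + 1]):
                raise ValueError(f"no value for {value}")
            add(Item(value, [tokens[i + 1][0]]))
            i += 2
        else:
            raise ValueError(f"unexpected token {value!r}")
    return blocks, index


SAMPLE = """
data_demo
_cell.length_a 10.5
_entry.id 'my entry'
loop_
_atom_site.id
_atom_site.type_symbol
1 C
2 N
"""
blocks, index = parse(SAMPLE)
symbols = index[("demo", "_atom_site.type_symbol")].values
```

Because the index stores references to the same `Item` objects that hang off the tree, a lookup by name does not need to walk the tree, and modifying an item through either route changes the same data.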

The Fortran interface shields the programmer from the more complex aspects of accessing CIFs by keeping all data structures internal to the toolbox. To access data, application programs specify files, data blocks and data names, and the values are handled as the Fortran data type corresponding to the type defined in the CIF dictionary. A system of contexts makes the data access routines effectively re-entrant, which simplifies multiple accesses to data categories; the application itself needs to maintain only a minimum of information between calls to toolbox routines.
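The context mechanism might look roughly like the following Python sketch. The actual interface is Fortran and its routine names are not given here, so every name below (`Toolbox`, `open_context`, `next_row`, `get`, `close_context`, and the `DICTIONARY` fragment) is hypothetical. The sketch assumes that a context is an opaque integer handle, that all row positions live inside the toolbox, and that the dictionary supplies the type to which each raw value is converted.

```python
# Hypothetical dictionary fragment: data name -> declared value type.
DICTIONARY = {
    "_cell.length_a": float,
    "_atom_site.id": int,
    "_atom_site.type_symbol": str,
}


class Toolbox:
    """Keeps all data internal; callers hold only integer context handles."""

    def __init__(self, dictionary):
        self._dictionary = dictionary
        self._blocks = {}    # block name -> {data name: list of raw strings}
        self._contexts = {}  # handle -> [block name, category, current row]
        self._next_handle = 1

    def load_block(self, name, items):
        self._blocks[name] = dict(items)

    def _row_count(self, block, category):
        prefix = f"_{category}."
        lengths = [len(values) for key, values in self._blocks[block].items()
                   if key.startswith(prefix)]
        return lengths[0] if lengths else 0

    def open_context(self, block, category):
        """Start an independent traversal of one category; return a handle."""
        handle = self._next_handle
        self._next_handle += 1
        self._contexts[handle] = [block, category, -1]
        return handle

    def next_row(self, handle):
        """Advance the context by one row; False when the rows run out."""
        state = self._contexts[handle]
        if state[2] + 1 >= self._row_count(state[0], state[1]):
            return False
        state[2] += 1
        return True

    def get(self, handle, item):
        """Return the current row's value, converted per the dictionary."""
        block, category, row = self._contexts[handle]
        if row < 0:
            raise RuntimeError("call next_row() before get()")
        name = f"_{category}.{item}"
        return self._dictionary[name](self._blocks[block][name][row])

    def close_context(self, handle):
        del self._contexts[handle]


tb = Toolbox(DICTIONARY)
tb.load_block("demo", {
    "_cell.length_a": ["10.5"],
    "_atom_site.id": ["1", "2"],
    "_atom_site.type_symbol": ["C", "N"],
})

# Two contexts on the same category do not disturb each other, so nested
# traversals (here, all ordered pairs of atoms) need no caller-side state.
pairs = []
outer = tb.open_context("demo", "atom_site")
while tb.next_row(outer):
    inner = tb.open_context("demo", "atom_site")
    while tb.next_row(inner):
        pairs.append((tb.get(outer, "id"), tb.get(inner, "id")))
    tb.close_context(inner)
tb.close_context(outer)
```

The only things the caller keeps between calls are the two integer handles, which is the kind of minimal application-side state the paragraph above describes.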

[1] CCP4. Acta Cryst. D50, 760-763 (1994)

[2] S.R. Hall, N. Spadaccini. J. Chem. Inf. Comput. Sci. 34, 505-508 (1994)

[3] T.J. Parr. 'The Purdue Compiler Construction Tool Set'. See http://www.igs.net/~mtr/software-development/pccts.htm

[4] J. Westbrook, S.R. Hall. See http://ndbserver.rutgers.edu/DDL/index.html