E0730

STAR/CIF MACROMOLECULAR NMR DATA DICTIONARIES AND DATA FILE FORMATS. Eldon L. Ulrich, David Argentar, Amy Klimowicz, William M. Westler, and John L. Markley, Dept. of Biochemistry, University of Wisconsin-Madison, Madison, WI 53706

BioMagResBank (BMRB), a database for information on macromolecular structure derived from NMR spectroscopy, has begun using STAR files and STAR and CIF data dictionaries for archiving and exchanging pertinent data. So that overlapping features of NMR and crystallographic data will be represented in a common format, we have collaborated with the Protein Data Bank (PDB) and the mmCIF authors in developing the STAR and CIF data dictionaries (NMRIF dictionary). One of the challenges has been to extend existing protocols to handle the wide variety of spectral, kinetic, thermodynamic, and structural data that can be derived from NMR spectroscopy. The STAR format has lent itself to precise descriptions of the kinds of complex systems studied by NMR spectroscopy, which may consist of a single macromolecular species in a solution of particular composition, heterogeneous molecular aggregates, or molecules that undergo dynamic processes such as chemical reactions or conformational interconversions. The data in a single report are frequently derived from multiple experiments in which the variables are typically the chemical composition of the sample or the parameters of the NMR experiments used for data collection. Authors may wish to report the primary data in a way that captures the conditions and protocols of individual experiments. We have endeavored to develop a file format that can be easily understood and manipulated by the domain scientist (a specialist in macromolecular NMR). The STAR format, in particular the "save frame" construct, has allowed us to encapsulate information pertaining to unique entities (molecules, samples, experimental procedures, sets of results, etc.) and to link related entities in a relatively efficient manner. In addition, the ability to limit the scope of a data tag to a single save frame greatly reduces the number of redundant data tags needed within a single file.
The content of the STAR and CIF NMR data dictionaries and their use in designing the BMRB NMR data deposition form and in exporting data from the BMRB database will be described.
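
As an illustration of the save frame construct described above, the following minimal Python sketch reads a small STAR-style fragment in which two save frames reuse the same tag name and one frame points to the other through a "$" frame code. The tag names (_Saveframe_category, _Sample_label, _Details) and the functions parse_save_frames and resolve are hypothetical and are not taken from the NMRIF dictionary; the parser handles only single-line tag/value pairs, not loops or multi-line values.

```python
import shlex

# Hypothetical STAR fragment: tag names are illustrative only.
STAR_TEXT = """
data_example_entry

save_sample_1
   _Saveframe_category   sample
   _Details              'protein in buffer'
save_

save_shift_set_1
   _Saveframe_category   chemical_shifts
   _Sample_label         $sample_1
   _Details              'referenced shifts'
save_
"""


def parse_save_frames(text):
    """Return {frame_name: {tag: value}} for a simplified STAR subset."""
    frames = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or line.startswith("data_"):
            continue
        if line == "save_":
            # A bare "save_" closes the current save frame.
            current = None
        elif line.startswith("save_"):
            # "save_<name>" opens a new frame with its own tag scope.
            current = frames.setdefault(line[len("save_"):], {})
        elif line.startswith("_") and current is not None:
            # shlex.split keeps a quoted value together as one token.
            tag, value = shlex.split(line)
            current[tag] = value
    return frames


def resolve(frames, value):
    """Follow a '$name' frame code to the save frame it names."""
    if value.startswith("$"):
        return frames[value[1:]]
    return value


frames = parse_save_frames(STAR_TEXT)
linked_sample = resolve(frames, frames["shift_set_1"]["_Sample_label"])
print(linked_sample["_Details"])
```

Because each frame is stored as its own dictionary, the tag _Details can appear in both frames without a frame-specific prefix, and the frame code lets the result set refer to its sample rather than repeat the sample description.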

Supported by a grant from the National Library of Medicine LM04958-05S3, a Dept. of Energy subcontract from the Protein Data Bank, and the graduate school of the Univ. of Wisconsin.