LOCAL MACROMOLECULAR STRUCTURE DATABASE FOR CRYSTALLOGRAPHIC LABORATORIES

Philip E. Bourne and Ilya N. Shindyalov, San Diego Supercomputer Center, PO Box 85608, San Diego CA 92186-9784.

As the number of macromolecular structures continues to grow exponentially, there is a need for a compact, easy-to-load and easy-to-query laboratory-based database system. Such a database should be capable of loading all or a subset of the structures found in the Protein Data Bank (PDB), as well as maintaining local data in PDB format. The ideal system should contain native and derived data, run on a variety of Unix platforms, and provide a Web-based graphical interface for querying the database. This paper reports on the design, capabilities and availability of such a database system. The San Diego Supercomputer Center (SDSC) version of the database is available via the World Wide Web on multiple servers (http://www.sdsc.edu/moose) and is within 24 hours of being current with the PDB native distribution as found in the PDB ftp archives. Using compression algorithms similar to those found in WPDB [1], the database reduces data storage requirements 10-fold relative to the native data without any loss of precision, while also including additional derived data.
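As an illustration of how PDB data can be stored more compactly without losing precision, the following minimal sketch packs coordinate fields as fixed-point integers. It is not the WPDB algorithm, whose details are given in [1]; it relies only on the fact that PDB coordinate fields carry three decimal places. The function names and the little-endian 32-bit layout are assumptions made for this example.

```python
import struct
from decimal import Decimal

def encode_coords(fields):
    """Pack PDB coordinate fields (text with 3 decimals) as 32-bit integers.

    Assumption for this sketch: every field parses as a decimal number
    with at most three decimal places, as in PDB ATOM records.
    """
    # Decimal arithmetic avoids binary floating-point rounding.
    milli = [int(Decimal(text.strip()) * 1000) for text in fields]
    return struct.pack("<%di" % len(milli), *milli)

def decode_coords(blob):
    """Unpack the integers and restore the original 8.3f text fields."""
    count = len(blob) // 4
    milli = struct.unpack("<%di" % count, blob)
    return [format(Decimal(m).scaleb(-3), "8.3f") for m in milli]

fields = ["  12.345", "  -0.500", " 103.070"]
blob = encode_coords(fields)
restored = decode_coords(blob)
```

Each 8-character text field becomes 4 bytes and is restored exactly. This accounts for only part of any overall saving; the 10-fold figure quoted above is for the full scheme, not for this sketch.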

Apart from the obvious types of queries based on authors, protein names and other basic information, queries can be made with respect to characteristics of the polypeptide chain, for example, sequence patterns (with gaps), patterns of secondary structure elements, and the percentage of different kinds of secondary structure elements. The most extensive queries are those based on patterns of amino acid properties. It is possible to search for patterns combining such properties as environmental exposure, hydrophobicity, volume, polarity, isoelectric point and B values. By taking properties from a given primary sequence and applying appropriate thresholds, threading is possible. Alternatively, the starting properties can be taken from a known structure and structural similarity determined. A range of graphical tools can be applied to the structures returned by a query. These include plots of various property patterns, contact maps and 3-D images.
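The two kinds of chain query described above can be sketched as follows. This is a minimal illustration, not the query engine of the database itself. It assumes that a gapped sequence pattern can be written as a regular expression and that a property pattern is a string of classes obtained by thresholding a per-residue scale. The function names, the class letters 'H' and 'P', the wildcard '.', and the choice of the Kyte-Doolittle hydropathy scale are all assumptions made for this example.

```python
import re

# Kyte-Doolittle hydropathy values (one choice of hydrophobicity scale).
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}

def find_sequence_pattern(sequence, pattern):
    """Return (start, matched_text) for each non-overlapping regex match.

    A gap of one to three residues is written as '.{1,3}'.
    """
    return [(m.start(), m.group()) for m in re.finditer(pattern, sequence)]

def find_property_pattern(sequence, pattern, scale=KYTE_DOOLITTLE, threshold=0.0):
    """Return start positions where the thresholded property string matches.

    Each residue is classed 'H' if its scale value exceeds the threshold,
    otherwise 'P'. In the pattern, '.' matches either class.
    """
    classes = "".join("H" if scale[res] > threshold else "P" for res in sequence)
    hits = []
    for start in range(len(classes) - len(pattern) + 1):
        window = classes[start:start + len(pattern)]
        if all(p == "." or p == c for p, c in zip(pattern, window)):
            hits.append(start)
    return hits

sequence = "MKVLAAGIDE"
gapped_hits = find_sequence_pattern(sequence, r"G.{1,3}D")
property_hits = find_property_pattern(sequence, "HH.H")
```

For this toy sequence, the gapped pattern matches "GID" at position 6 and the property pattern matches at positions 2 and 4 (zero-based). Combining several properties, as the text describes, would amount to running this comparison over more than one class string at once.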

The database system can be obtained by contacting the first author.

[1] I. N. Shindyalov and P. E. Bourne (1995), J. Appl. Cryst. 28(6), 847-852.