S0232

THE PROTEIN DATA BANK AND THE CAMBRIDGE STRUCTURAL DATABASE: INTER-RELATIONSHIPS IN CONSTRUCTION AND UTILIZATION. Frank H. Allen, Cambridge Crystallographic Data Centre, 12 Union Road, Cambridge CB2 1EZ, England

Proteins are chemically simple but structurally complex, while small molecules are chemically complex but structurally simple. Because of, or even despite, these (rather generalised) differences, there are a number of areas in which the two Centres can collaborate in database construction, and many areas in which the two databases can be used together in research applications.

Both databases are growing rapidly. The present PDB doubling period is ca. 3 years, similar to that of the CSD in the early 1970s, while the CSD currently doubles every 6-7 years. This is a crucial time for database creators, as experimentalists, journals and databases are brought ever closer by rapid communications. Novel publication routes are emerging, but we must ensure that these new mechanisms also improve data capture and integrity. Both databases are automating direct data deposition procedures based on CIF, mmCIF and MIF. More specifically, the PDB and CSD are collaborating to encode atomic-level chemical connection tables for protein-bound ligand molecules, which will then be available for 2D-substructure and 3D-structure searching by suitable software.
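
The two doubling periods can be put on a common footing by converting them to equivalent annual growth rates. The following is a minimal Python sketch that assumes steady exponential growth, an idealisation not stated in the abstract; the function name annual_growth_rate is hypothetical and used only for illustration.

```python
def annual_growth_rate(doubling_years: float) -> float:
    """Fractional growth per year implied by a doubling period.

    Assumes steady exponential growth (an illustrative assumption).
    """
    if doubling_years <= 0:
        raise ValueError("doubling period must be positive")
    return 2.0 ** (1.0 / doubling_years) - 1.0


# Approximate doubling periods quoted in the text.
for label, years in [("PDB", 3.0), ("CSD (6 years)", 6.0), ("CSD (7 years)", 7.0)]:
    print(f"{label}: {annual_growth_rate(years):.1%} per year")
```

Under this assumption, a 3-year doubling period corresponds to growth of roughly 26% per year, and a 6-7 year doubling period to roughly 10-12% per year.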

At the applications level, small-molecule data are used to enhance macromolecular model-building, structure refinement and interpretation. Protein data, particularly from protein-ligand complexes, can indicate the chemical complexity needed to design novel biologically active ligands. Knowledge derived from small-molecule structures can, in turn, indicate possible binding modes for proposed new active compounds. These synergies will be illustrated in the talk.