


Feature article
Update on Protein Data Bank activities
![[PDB logo]](https://www.iucr.org/__data/assets/image/0005/20498/pdblogo.jpg)
Data Deposition
From July 1999 to June 2000, 2292 structures were deposited at the RCSB, and 468 backlog entries and 456 'layer 1' entries were processed. The deposition rate during the period averaged 44 per week, and the average time to fully process an entry was less than 12 days. Of the deposited structures, 81% are from X-ray experiments and 15% are from NMR experiments. By region, 61% of the data deposited are from North America, 25% from Europe, 11% from Asia, and 2.4% from Australia. Proteins make up 89% of the depositions, while 11% are nucleic acids. Of the structures deposited, 20% were indicated by the author to be held until a particular date, 57% were indicated to be held until the publication of the corresponding article, and 23% were indicated to be released immediately.
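As a quick check on the figures above, the weekly deposition rate can be recomputed from the total. The sketch below assumes the reporting period runs from 1 July 1999 through 30 June 2000 inclusive (the report gives only the months); the function and variable names are illustrative and do not come from the report.

```python
from datetime import date

# Figure quoted in the report.
DEPOSITIONS = 2292

# Assumption: the period covers 1 July 1999 through 30 June 2000, inclusive.
PERIOD_START = date(1999, 7, 1)
PERIOD_END = date(2000, 6, 30)


def weekly_rate(count, start, end):
    """Average count per week over an inclusive date range."""
    days = (end - start).days + 1
    return count / (days / 7)


rate = weekly_rate(DEPOSITIONS, PERIOD_START, PERIOD_END)

# Release-policy shares quoted in the report, in percent.
release_policy = {
    "hold until a particular date": 20,
    "hold until publication": 57,
    "release immediately": 23,
}
release_total = sum(release_policy.values())

print(f"{rate:.1f} depositions per week")
print(f"release-policy shares sum to {release_total}%")
```

Under that assumption the rate rounds to the 44 depositions per week quoted in the report, and the three release-policy shares account for all deposited structures.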
A popular PDB feature, the Validation Server (ADIT), allows depositors to check a structure at any time during structure determination and refinement. It checks the format consistency of coordinates during the Precheck step and creates validation reports about a structure before deposition using ADIT (http://pdb.rutgers.edu/). Once deposited, entries are processed to completion, returned to the author for review, and released on the PDB site (www.pdb.org/) and its mirrors. The PDB staff continues to enhance and upgrade the capabilities of the PDB searching and reporting tools. As part of the Data Uniformity project, PDB staff members have curated the R-factor, resolution, and primary citation data for all entries in the PDB and have incorporated this information into the database. These fields are available for improved searching, and the updated data are available via database reports.
Data Distribution
The site’s 'Get Educated' page includes an introduction to proteins for general audiences and materials for undergraduates on topics such as nucleic acids, principles of protein structure, and electron microscopy. Tutorials are available on how to query the PDB and how to use two popular molecular graphics viewing programs, RasMol and the Swiss-PDB Viewer (Guex and Peitsch, 1997). Links are frequently added to this resource, which also includes papers on the PDB, animated presentations about the PDB, and VRML 'protein documentaries' developed by students. The electronic help desk at info@rcsb.org is available to answer all types of questions about the PDB, usually within 24 hours.
Other developments in query and reporting include expanded ligand searching and reporting capabilities, improved access to dynamic links using the Molecular Information Agent (http://mia.sdsc.edu), accurate querying of enzymes, the incorporation of cross-links to sequence databases, and improved graphics options. The PDB can now be queried by source, by number of chains, and by the availability of experimental data.
Each month a key biological molecule is profiled as the Molecule of the Month. Beautiful images of the molecule, provided by D. Goodsell of the Scripps Research Institute, are featured on the PDB home page, and links provide additional information about the structure and function of the molecule at a general level.
Usage, which grew in the initial months of operation, has now leveled off at about 90,000 Web hits per day and 70,000 PDB files downloaded per day.
![[RCSB PDB team]](https://www.iucr.org/__data/assets/image/0016/20464/rcsbpdbteam.jpg)
It is estimated that the PDB could grow to approximately 35,000 structures by 2005, nearly tripling its size. A major factor in this growth is structural proteomics, the determination of the structures of as many proteins as possible in the shortest time possible. This increased volume will present a challenge to the PDB. As technology advances, the PDB’s user base will also expand. To accommodate this demand, the RCSB plans to enhance the robustness of the PDB’s query capabilities. The RCSB is also proceeding with the next phase of archiving the physical data, which involves scanning and electronically storing all documents associated with the PDB. Data uniformity work will continue, focusing on structure classification, compound records, chain ID fields, refinement parameters, coordinates, sequence records, and the biological unit.
(taken from the PDB annual report July 1999-June 2000)