Feature article

Structural genomics

Crystallography is poised to play a critical role in research to determine the biochemical functions of proteins encoded by human, animal, insect, and bacterial genes. Since the function of a gene product is tightly coupled to its three-dimensional structure, determining the structure, or its folding pattern, can provide insight into its biochemical function. The human genome alone is estimated to have more than 80,000 genes. Current estimates of the fraction of genes that encode products with ‘known’ biochemical and biological functions range from approximately 30% to 60%, depending on the organism.
An examination of the Brookhaven Protein Data Bank (PDB) reveals that fewer new folds are discovered each year. Many proteins are composed of two or more folding domains. Some folds are found among proteins from all three kingdoms: bacteria, archaea and eukarya. It is postulated that there is a finite number of folds making up a ‘fold basis set’. Several studies have revealed that only about 20-35% of the open reading frames of known genomic sequences are represented in the current protein structure database. Gene products of unknown function that have no sequence similarity with proteins of known function, together with membrane-bound proteins, constitute a major portion of the unknown folds. Assuming that the PDB collection presently represents about one third of all protein folds, 700-10,000 new folds need to be experimentally discovered.
The structure determination of gene products by crystallographic and NMR methods will need to be coordinated internationally. Development, optimization and automation are needed for large-scale cloning and expression of chosen genes, and for high-throughput purification of the gene products. Also needed are high-throughput crystallization screening methods and automation of X-ray diffraction data acquisition.
Synchrotron radiation is essential not only for the highest-quality data acquisition and data throughput, but also for working with microcrystals and for structure determination by multiple-wavelength anomalous diffraction (MAD). Crystallography has a clear and compelling role in providing a foundation for functional genomics, which is the ultimate objective of sequencing the entire genome of an organism.

Soug-Hou Kim
condensed from Nature Structural Biology, Aug 1998