E0274

STRUCTURAL DIVERSITY OF SEQUENTIALLY IDENTICAL SUBSEQUENCES OF PROTEINS
Sucha Sudarsanam and Subhashini Srinivasan, Department of Protein Chemistry, Immunex Corporation, Seattle, WA 98101

Spectroscopic studies indicate that protein folding occurs through the formation of stable secondary structures as intermediates. Consequently, tertiary structure prediction algorithms have approached protein folding by attempting to predict secondary structures and assembling them into tertiary structures. The success of these algorithms depends on the accurate prediction of local structures encoded by subsequences of proteins, herein termed n-mers. In this context, one important question is: what is the minimum number of amino acids needed to form unique structures that are stabilized only by local interactions? The most recent analysis (Cohen et al., Prot. Sci. 2, 2134-2145, 1993), using the July 1990 release of the PDB, found that identical 6-mers can have dissimilar structures. Given the explosive growth of the PDB since then, we have revisited this question.

We have analyzed unrelated protein structures (as measured by pairwise sequence identity after an optimal global sequence alignment) in the most recent release of the PDB for identical n-mers. A database consisting of the sequences of polypeptide chains along with their backbone dihedral angles φ(i+1), ψ(i), for i = 1, ..., m - 1, where m is the length of a polypeptide chain, was constructed using procedures described earlier (Sudarsanam et al., Prot. Sci. 4, 1412-1420, 1995). This database can be thought of as a "condensed" version of the PDB containing sequence and structural information for backbone conformations. For each polypeptide chain, n-mers with n >= 5 were searched against the database for identical matches. Structural similarity of a pair of n-mers was measured by backbone root-mean-square deviation (RMSD).
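The search described above can be sketched in a few lines of Python. This is an illustrative sketch only, not the procedure actually used in this work: the data layout (a dictionary mapping a chain identifier to a sequence and a per-residue list of (phi, psi) angles in degrees), the function names, and the toy chains are all assumptions. The filter for unrelated proteins is omitted, and a dihedral-angle RMS difference stands in for the backbone coordinate RMSD, which would require atomic coordinates and superposition.

```python
import math
from collections import defaultdict


def index_nmers(chains, n):
    """Map each n-mer sequence to the (chain_id, start) positions where it occurs."""
    index = defaultdict(list)
    for chain_id, (sequence, _dihedrals) in chains.items():
        for start in range(len(sequence) - n + 1):
            index[sequence[start:start + n]].append((chain_id, start))
    return index


def identical_nmer_pairs(chains, n):
    """Yield (nmer, hit_a, hit_b) for identical n-mers found in two different chains."""
    for nmer, hits in index_nmers(chains, n).items():
        for i in range(len(hits)):
            for j in range(i + 1, len(hits)):
                if hits[i][0] != hits[j][0]:
                    yield nmer, hits[i], hits[j]


def angle_diff(a, b):
    """Signed difference a - b between two angles in degrees, wrapped to [-180, 180)."""
    return (a - b + 180.0) % 360.0 - 180.0


def dihedral_rmsd(angles_a, angles_b):
    """RMS difference over all phi and psi angles of two equal-length segments.

    Assumption: this angular measure replaces the coordinate RMSD of the text.
    """
    squared = [
        angle_diff(x, y) ** 2
        for pair_a, pair_b in zip(angles_a, angles_b)
        for x, y in zip(pair_a, pair_b)
    ]
    return math.sqrt(sum(squared) / len(squared))


# Hypothetical toy chains sharing the 5-mer "KLVEG": one helix-like, one strand-like.
chains = {
    "A": ("AGKLVEGT", [(-60.0, -45.0)] * 8),
    "B": ("TTKLVEGS", [(-120.0, 130.0)] * 8),
}

n = 5
results = []
for nmer, (chain_a, start_a), (chain_b, start_b) in identical_nmer_pairs(chains, n):
    seg_a = chains[chain_a][1][start_a:start_a + n]
    seg_b = chains[chain_b][1][start_b:start_b + n]
    results.append((nmer, chain_a, chain_b, dihedral_rmsd(seg_a, seg_b)))

for nmer, chain_a, chain_b, rmsd in results:
    print(f"{nmer}: chains {chain_a}/{chain_b}, dihedral RMSD {rmsd:.1f} degrees")
```

Indexing every n-mer once keeps the search linear in the total sequence length, so only the pairs that share a sequence need a structural comparison. A large angular difference for an identical n-mer, as in the toy helix/strand pair, marks the kind of case the analysis looks for.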

We find that the population of 6-mers with identical sequences but dissimilar structures has increased since the last study. In addition, we find at least one pair of identical 7-mers with dissimilar structures. The ability of identical n-mers to adopt different conformations emphasizes the complex interplay of short- and long-range interactions in protein folding, which will be discussed.