Hello again -

In what I think was private email (I get confused with so many discussions going on) Peter asked where to find the 5HVP example file. I used to be able to generate the example directly from the dictionary, but when I tried to do that just now I found that recent versions of the dictionary have broken the program that does this (I know, I know, that's the problem with tools that are just hacks). Anyway, I thought it might be worth sending all of you the most recent version of the example that I have - it is dated 1993-10-15. Have fun.

Paula

- - - - -

THE MACROMOLECULAR CIF DICTIONARY - 15 Oct 1993

Paula Fitzgerald
Merck Research Laboratories
P. O. Box 2000, Ry50-105
Rahway, New Jersey 07065
(908) 594-5510 (voice)
(908) 594-5510 (FAX)
paula_fitzgerald@merck.com (email)

(n.b. CIF example current as of 15-Oct-93, but notes have not been updated since 20-May-93).

CIF = Crystallographic Information File

CIF is a subset of STAR (Self-defining Text Archive and Retrieval format); see S.R. Hall (1991) J. of Chemical Information and Computer Science, 31, 326-333. The format is suitable for archiving all types of text and numerical data, in any order. The goals of CIF are generality, upward compatibility, flexibility and electronic publication. CIF was developed by the IUCr Working Party on Crystallographic Information, in an effort sponsored by the IUCr Commission on Crystallographic Data and the IUCr Commission on Journals. The result of this effort was a dictionary of data items sufficient for archiving the small molecule crystallographic experiment and its results (S.R. Hall, F.H. Allen and I.D. Brown (1991) Acta Cryst. A47, 655-685). This dictionary was adopted by the IUCr at its 1990 Congress in Bordeaux. CIF is now the format in which structure papers are submitted to Acta Crystallographica Section C; software has been developed to automatically typeset a paper from a CIF.
In 1990, the IUCr formed a working group to expand the dictionary to include data items relevant to the macromolecular crystallographic experiment. This working group is chaired by Paula Fitzgerald (Merck); the members of the group are Enrique Abola (Protein Data Bank), Helen Berman (Rutgers), Phil Bourne (Columbia), Eleanor Dodson (York), Art Olson (Scripps), Wolfgang Steigemann (Martinsried), Lynn Ten Eyck (UCSD) and Keith Watenpaugh (Upjohn). The short-term goal of the working group is to fulfill the mandate set by the IUCr: to define the CIF data names that need to be added to the core CIF dictionary in order to adequately describe the macromolecular crystallographic experiment and its results. But the working group also feels that it has long-term goals: to provide sufficient data names so that the experimental section of a structure paper could be written automatically, and to facilitate the development of tools so that computer programs can easily interface directly with the CIF. This involves generating a community-wide consensus about the completeness and accuracy of the data names and soliciting the involvement of the community in the development of the needed tools. The two and a half years in which the macromolecular CIF effort has been under way have coincided with years of great change at the Protein Data Bank. The exponentially increasing volume of coordinate depositions demands a completely automated data processing protocol. In addition, there has been growing and widespread frustration with the fact that so much valuable information is stored in free-format remarks, and with the limitations imposed by the current fixed-format PDB data structure. Both of these factors have caused the PDB to realize that a new format is needed, and thus the PDB has decided that it will adopt CIF (or a subset of CIF) as its new exchange format. A draft version of the macromolecular CIF dictionary is now largely complete. 
Data items have been added to the CIF core to describe the phasing process, to describe more fully the quality of the diffraction data, and to describe the results of structure refinement. In addition to these experimental matters, a key effort has been made to define descriptors for the structure that will allow the user (crystallographer and non-crystallographer alike) to rapidly extract the biological structure (as distinct from the contents of the asymmetric unit). Other data items have been developed to maintain compatibility with the current PDB format. Most data items now have entries in the draft version of the dictionary, although in some instances the definitions are sketchy, and in some cases probably downright wrong. The working group has found at this point that the best way to reveal difficulties in the dictionary is to work through examples. Although we hesitate to begin circulating a document that we know still is not complete, we must start to get input from the community. This is a format that we will all have to deal with at one level or another, because it will be the new format for the PDB. We hope it will have much wider usage than that, providing a mechanism for transparent interchange between different programming environments, and providing a solution for the vexing problems of structure archiving that we all deal with. It thus behooves everyone in the macromolecular community to take an interest in the format while it is still fluid and changes (even major ones) can still be implemented. The example below will give you a flavor of what we are trying to do. All comments will be listened to, but those that propose constructive alternatives to unpopular features will be listened to most carefully. To some degree our hands are tied by features of the CIF core (32-character limits for data names, one level of nesting for loops) but in other areas we have greater flexibility. 
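To make the two CIF core constraints just mentioned a little more concrete (the 32-character limit on data names and the single level of loop nesting), here is a minimal Python sketch of how a program might read simple CIF text. It is an illustration only, not one of the working group's tools: the names `tokenize`, `unquote`, `parse_cif` and `MAX_NAME_LEN` are hypothetical, the sketch assumes the 32-character count includes the leading underscore, and it deliberately ignores semicolon-delimited text fields and quoted strings that contain apostrophes. The sample values are taken from the example.

```python
import re

MAX_NAME_LEN = 32  # limit quoted above; assumed here to include the leading "_"

# A token is either a single-quoted string or a run of non-blank characters.
TOKEN = re.compile(r"'[^']*'|\S+")

def tokenize(text):
    """Split CIF text into tokens, skipping whole-line '#' comments."""
    tokens = []
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue
        tokens.extend(TOKEN.findall(line))
    return tokens

def unquote(tok):
    """Strip one pair of surrounding single quotes, if present."""
    return tok[1:-1] if len(tok) >= 2 and tok[0] == tok[-1] == "'" else tok

def parse_cif(text):
    """Return (items, loops): name/value pairs and a list of flat loops."""
    tokens = tokenize(text)
    items, loops = {}, []
    i = 0
    while i < len(tokens):
        if tokens[i] == "loop_":
            i += 1
            names = []
            while i < len(tokens) and tokens[i].startswith("_"):
                names.append(tokens[i])
                i += 1
            values = []
            # One level of nesting only: values run until the next name or loop_.
            while (i < len(tokens) and tokens[i] != "loop_"
                   and not tokens[i].startswith("_")):
                values.append(unquote(tokens[i]))
                i += 1
            if not names or len(values) % len(names):
                raise ValueError("loop values do not fill whole rows")
            n = len(names)
            loops.append([dict(zip(names, values[j:j + n]))
                          for j in range(0, len(values), n)])
        elif tokens[i].startswith("_") and i + 1 < len(tokens):
            items[tokens[i]] = unquote(tokens[i + 1])
            i += 2
        else:
            raise ValueError(f"unexpected token {tokens[i]!r}")
    return items, loops

SAMPLE = """\
_symmetry_cell_setting           orthorhombic
_symmetry_space_group_name_H-M   'P 21 21 2'
loop_
_struct_asym_id
_struct_asym_entity_id
A 1
B 1
C 2
D 2
"""

items, loops = parse_cif(SAMPLE)
names_used = list(items) + list(loops[0][0])
too_long = [n for n in names_used if len(n) > MAX_NAME_LEN]
```

Each loop comes back as a list of rows keyed by data name, so `loops[0]` holds one dictionary per `_struct_asym_id`. A real reader would also need to handle the multi-line text fields that appear throughout the example.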
I would like to thank all of the members of the working group for their input into the process, but I would like to particularly acknowledge the efforts of Helen Berman, Phil Bourne and Keith Watenpaugh, who have labored mightily to bring the dictionary to its present state. - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - loop_ _atom_site_group_PDB _atom_site_type_symbol _atom_site_label_atom_id _atom_site_label_res_id _atom_site_label_asym_id _atom_site_label_seq_id _atom_site_label_alt_id _atom_site_Cartn_x _atom_site_Cartn_y _atom_site_Cartn_z _atom_site_occupancy _atom_site_B_iso_or_equiv _atom_site_footnote_id _atom_site_entity_id _atom_site_entity_seq_num ATOM N N VAL A 11 ? 25.369 30.691 11.795 1.00 17.93 ? 1 11 ATOM C CA VAL A 11 ? 25.970 31.965 12.332 1.00 17.75 ? 1 11 ATOM C C VAL A 11 ? 25.569 32.010 13.808 1.00 17.83 ? 1 11 ATOM O O VAL A 11 ? 24.735 31.190 14.167 1.00 17.53 ? 1 11 ATOM C CB VAL A 11 ? 25.379 33.146 11.540 1.00 17.66 ? 1 11 ATOM C CG1 VAL A 11 ? 25.584 33.034 10.030 1.00 18.86 ? 1 11 ATOM C CG2 VAL A 11 ? 23.933 33.309 11.872 1.00 17.12 ? 1 11 ATOM N N THR A 12 ? 26.095 32.930 14.590 1.00 18.97 4 1 12 ATOM C CA THR A 12 ? 25.734 32.995 16.032 1.00 19.80 4 1 12 ATOM C C THR A 12 ? 24.695 34.106 16.113 1.00 20.92 4 1 12 ATOM O O THR A 12 ? 24.869 35.118 15.421 1.00 21.84 4 1 12 ATOM C CB THR A 12 ? 26.911 33.346 17.018 1.00 20.51 4 1 12 ATOM O OG1 THR A 12 3 27.946 33.921 16.183 0.50 20.29 4 1 12 ATOM O OG1 THR A 12 4 27.769 32.142 17.103 0.50 20.59 4 1 12 ATOM C CG2 THR A 12 3 27.418 32.181 17.878 0.50 20.47 4 1 12 ATOM C CG2 THR A 12 4 26.489 33.778 18.426 0.50 20.00 4 1 12 ATOM N N ILE A 13 ? 23.664 33.855 16.884 1.00 22.08 ? 1 13 ATOM C CA ILE A 13 ? 22.623 34.850 17.093 1.00 23.44 ? 1 13 ATOM C C ILE A 13 ? 22.657 35.113 18.610 1.00 25.77 ? 1 13 ATOM O O ILE A 13 ? 23.123 34.250 19.406 1.00 26.28 ? 1 13 ATOM C CB ILE A 13 ? 21.236 34.463 16.492 1.00 22.67 ? 1 13 ATOM C CG1 ILE A 13 ? 
20.478 33.469 17.371 1.00 22.14 ? 1 13 ATOM C CG2 ILE A 13 ? 21.357 33.986 15.016 1.00 21.75 ? 1 13 # - - - - data truncated for brevity - - - - ATOM C C1 APS C 300 1 4.171 29.012 7.116 0.58 17.27 1 2 ? ATOM C C2 APS C 300 1 4.949 27.758 6.793 0.58 16.95 1 2 ? ATOM O O3 APS C 300 1 4.800 26.678 7.393 0.58 16.85 1 2 ? ATOM N N4 APS C 300 1 5.930 27.841 5.869 0.58 16.43 1 2 ? # - - - - data truncated for brevity - - - - _atom_sites_Cartn_transform_axes 'c along z, astar along x, b along y' _atom_sites_Cartn_tran_matrix_11 58.39 _atom_sites_Cartn_tran_matrix_12 0.00 _atom_sites_Cartn_tran_matrix_13 0.00 _atom_sites_Cartn_tran_matrix_21 0.00 _atom_sites_Cartn_tran_matrix_22 86.70 _atom_sites_Cartn_tran_matrix_23 0.00 _atom_sites_Cartn_tran_matrix_31 0.00 _atom_sites_Cartn_tran_matrix_32 0.00 _atom_sites_Cartn_tran_matrix_33 46.27 _atom_sites_fract_tran_matrix_11 0.017126 _atom_sites_fract_tran_matrix_12 0.000000 _atom_sites_fract_tran_matrix_13 0.000000 _atom_sites_fract_tran_matrix_21 0.000000 _atom_sites_fract_tran_matrix_22 0.011534 _atom_sites_fract_tran_matrix_23 0.000000 _atom_sites_fract_tran_matrix_31 0.000000 _atom_sites_fract_tran_matrix_32 0.000000 _atom_sites_fract_tran_matrix_33 0.021612 loop_ _atom_sites_alt_id _atom_sites_alt_details '?' ; Atom sites with the alternate id set to null are not modelled in alter- nate conformations ; '1' ; Atom sites with the alternate id set to 1 have been modelled in alternate conformations with respect to atom sites marked with alternate conformation id 2. The conformations of amino acid side chains and solvent atoms with alternate id set to 1 correlate with the conformation of the inhibitor marked with alternate id 1. They have been given an occupancy of 0.58 to match the occupancy assigned to the inhibitor. ; '2' ; Atom sites with the alternate id set to 2 have been modelled in alternate conformations with respect to atom sites marked with alternate conformation id 1. 
The conformations of amino acid side chains and solvent atoms with alternate id set to 2 correlate with the conformation of the inhibitor marked with alternate id 2. They have been given an occupancy of 0.42 to match the occupancy assigned to the inhibitor. ; '3' ; Atom sites with the alternate id set to 3 have been modelled in alternate conformations with respect to atoms marked with alternate conformation id 4. The conformations of amino acid side chains and solvent atoms with alternate id set to 3 do not correlate with the conformation of the inhibitor. These atom sites have arbitrarily been given an occupancy of 0.50. ; '4' ; Atom sites with the alternate id set to 4 have been modelled in alternate conformations with respect to atoms marked with alternate conformation id 3. The conformations of amino acid side chains and solvent atoms with alternate id set to 4 do not correlate with the conformation of the inhibitor. These atom sites have arbitrarily been given an occupancy of 0.50. ; loop_ _atom_sites_alt_ens_id _atom_sites_alt_ens_details 'Ensemble 1-A' ; The inhibitor binds to the enzyme in two, roughly twofold symmetric, alternate conformations. This conformational ensemble includes the more populated conformation of the inhibitor (id=1) and the amino acid side chains and solvent structure that correlate with this inhibitor conformation. Also included are one set (id=3) of side chains with alternate conform- ations when the conformations are not correlated with the inhibitor conformation. ; 'Ensemble 1-B' ; The inhibitor binds to the enzyme in two, roughly twofold symmetric alternate conformations. This conformational ensemble includes the more populated conformation of the inhibitor (id=1) and the amino acid side chains and solvent structure that correlate with this inhibitor conformation. Also included are one set (id=4) of side chains with alternate conform- ations when the conformations are not correlated with the inhibitor conformation. 
; 'Ensemble 2-A' ; The inhibitor binds to the enzyme in two, roughly twofold symmetric alternate conformations. This conformational ensemble includes the less populated conformation of the inhibitor (id=2) and the amino acid side chains and solvent structure that correlate with this inhibitor conformation. Also included are one set (id=3) of side chains with alternate conform- ations when the conformations are not correlated with the inhibitor conformation. ; 'Ensemble 2-B' ; The inhibitor binds to the enzyme in two, roughly twofold symmetric alternate conformations. This conformational ensemble includes the less populated conformation of the inhibitor (id=2) and the amino acid side chains and solvent structure that correlate with this inhibitor conformation. Also included are one set (id=4) of side chains with alternate conform- ations when the conformations are not correlated with the inhibitor conformation. ; loop_ _atom_sites_alt_gen_ens_id _atom_sites_alt_gen_alt_id 'Ensemble 1-A' '?' 'Ensemble 1-A' '1' 'Ensemble 1-A' '3' 'Ensemble 1-B' '?' 'Ensemble 1-B' '1' 'Ensemble 1-B' '4' 'Ensemble 2-A' '?' 'Ensemble 2-A' '2' 'Ensemble 2-A' '3' 'Ensemble 2-B' '?' 'Ensemble 2-B' '2' 'Ensemble 2-B' '4' loop_ _atom_sites_footnote_id _atom_sites_footnote_text 1 ; The inhibitor binds to the enzyme in two alternate orientations. The two orientations have been assigned alternate location indicators *1* and *2*. ; 2 ; Side chains of these residues adopt alternate orientations that corre- late with the alternate orientations of the inhibitor. Side chains with alternate location indicator *1* and occupancy 0.58 correlate with inhibitor orientation *1*. Side chains with alternate location indicator *2* and occupancy 0.42 correlate with inhibitor orientation *2*. ; 3 ; The positions of these water molecules correlate with the alternate orientations of the inhibitor. Water molecules with alternate location indicator *1* and occupancy 0.58 correlate with inhibitor orientation *1*. 
Water molecules with alternate location indicator *2* and occupancy 0.42 correlate with inhibitor orientation *2*. ; 4 ; Side chains of these residues adopt alternate orientations that do not correlate with the alternate orientation of the inhibitor. ; 5 ; The positions of these water molecules correlate with alternate orien- tations of amino acid side chains that do not correlate with alternate orientations of the inhibitor. ; loop_ _atom_type_symbol _atom_type_oxidation_number _atom_type_scat_Cromer_Mann_a1 _atom_type_scat_Cromer_Mann_a2 _atom_type_scat_Cromer_Mann_a3 _atom_type_scat_Cromer_Mann_a4 _atom_type_scat_Cromer_Mann_b1 _atom_type_scat_Cromer_Mann_b2 _atom_type_scat_Cromer_Mann_b3 _atom_type_scat_Cromer_Mann_b4 _atom_type_scat_Cromer_Mann_c C 0 2.31000 20.8439 1.02000 10.2075 1.58860 0.568700 0.865000 51.6512 0.21560 N 0 12.2126 0.005700 3.13220 9.89330 2.01250 28.9975 1.16630 0.582600 -11.529 O 0 3.04850 13.2771 2.28680 5.70110 1.54630 0.323900 0.867000 32.9089 0.250800 S 0 6.90530 1.46790 5.20340 22.2151 1.43790 0.253600 1.58630 56.1720 0.866900 CL -1 18.2915 0.006600 7.20840 1.17170 6.53370 19.5424 2.33860 60.4486 -16.378 _audit_creation_date '92-12-08' _audit_creation_method ; Created by hand from PDB entry 5HVP, from the JBC paper describing this structure and from laboratory records ; _audit_update_record ; 92-12-09 adjusted to reflect comments from Brian McKeever 92-12-10 adjusted to reflect comments from Helen Berman 92-12-12 adjusted to reflect comments from Keith Watenpaugh ; loop_ _audit_author_name _audit_author_address 'Fitzgerald, Paula M.D.' ; Department of Biophysical Chemistry Merck Research Laboratories P. O. Box 2000, Ry80M203 Rahway, New Jersey 07065 USA ; 'McKeever, Brian M.' ; Department of Biophysical Chemistry Merck Research Laboratories P. O. Box 2000, Ry80M203 Rahway, New Jersey 07065 USA ; 'Van Middlesworth, J.F.' ; Department of Biophysical Chemistry Merck Research Laboratories P. O. 
Box 2000, Ry80M203 Rahway, New Jersey 07065 USA ; 'Springer, James P.' ; Department of Biophysical Chemistry Merck Research Laboratories P. O. Box 2000, Ry80M203 Rahway, New Jersey 07065 USA ; _audit_contact_author_name 'Fitzgerald, Paula M.D.' _audit_contact_author_address ; Department of Biophysical Chemistry Merck Research Laboratories P. O. Box 2000, Ry80M203 Rahway, New Jersey 07065 USA ; _audit_contact_author_phone '908 594 5510' _audit_contact_author_fax '908 594 6645' _audit_contact_author_email 'paula_fitzgerald@merck.com' _cell_length_a 58.39(5) _cell_length_b 86.70(12) _cell_length_c 46.27(6) _cell_angle_alpha 90.00 _cell_angle_beta 90.00 _cell_angle_gamma 90.00 _cell_volume 234237 _cell_special_details ; The cell parameters were refined every twenty frames during data integra- tion. The cell lengths given are the mean of 55 such refinements; the esds given are the root mean square deviations of these 55 observations from that mean. ; _cell_measurement_temperature 293(3) _cell_measurement_theta_min 11 _cell_measurement_theta_max 31 _cell_measurement_wavelength 1.54 loop_ _citation_id _citation_coordinate_linkage _citation_title _citation_country _citation_page_first _citation_page_last _citation_year _citation_journal_abbrev _citation_journal_volume _citation_journal_issue _citation_journal_coden_ASTM _citation_journal_coden_ISSN _citation_journal_coden_PDB _citation_book_title _citation_book_publisher _citation_book_coden_ISBN _citation_special_details primary yes ; Crystallographic analysis of a complex between human immunodeficiency virus type 1 protease and acetyl-pepstatin at 2.0-Angstroms resolution. ; US 14209 14219 1990 'J. Biol. Chem.' 265 ? HBCHA3 0021-9258 071 ? ? ? ; The publication that directly relates to this coordinate set. ; 2 no ; Three-dimensional structure of aspartyl-protease from human immunodeficiency virus HIV-1. ; UK 615 619 1989 'Nature' 337 ? NATUAS 0028-0836 006 ? ? ? ; Determination of the structure of the unliganded enzyme. 
; 3 no ; Crystallization of the aspartylprotease from human immunodeficiency virus, HIV-1. ; US 1919 1921 1989 'J. Biol. Chem.' 264 ? HBCHA3 0021-9258 071 ? ? ? ; Crystallization of the unliganded enzyme. ; 4 no ; Human immunodeficiency virus protease. Bacterial expression and characterization of the purified aspartic protease. ; US 2307 2312 1989 'J. Biol. Chem.' 264 ? HBCHA3 0021-9258 071 ? ? ? ; Expression and purification of the enzyme. ; loop_ _citation_author_citation_id _citation_author_name primary 'Fitzgerald, P.M.D.' primary 'McKeever, B.M.' primary 'Van Middlesworth, J.F.' primary 'Springer, J.P.' primary 'Heimbach, J.C.' primary 'Leu, C.-T.' primary 'Herber, W.K.' primary 'Dixon, R.A.F.' primary 'Darke, P.L.' 2 'Navia, M.A.' 2 'Fitzgerald, P.M.D.' 2 'McKeever, B.M.' 2 'Leu, C.-T.' 2 'Heimbach, J.C.' 2 'Herber, W.K.' 2 'Sigal, I.S.' 2 'Darke, P.L.' 2 'Springer, J.P.' 3 'McKeever, B.M.' 3 'Navia, M.A.' 3 'Fitzgerald, P.M.D.' 3 'Springer, J.P.' 3 'Leu, C.-T.' 3 'Heimbach, J.C.' 3 'Herber, W.K.' 3 'Sigal, I.S.' 3 'Darke, P.L.' 4 'Darke, P.L.' 4 'Leu, C.-T.' 4 'Davis, L.J.' 4 'Heimbach, J.C.' 4 'Diehl, R.E.' 4 'Hill, W.S.' 4 'Dixon, R.A.F.' 4 'Sigal, I.S.' _computing_data_collection 'Collect (Siemens)' _computing_data_reduction 'Xengen (Howard)' _computing_phasing_MR 'Merlot (Fitzgerald)' _computing_molecular_graphics 'Protein (Steigemann), Frodo (Jones)' _computing_structure_refinement 'Protin/Prolsq (Konnert, Hendrickson)' _database_code_PDB 5HVP loop_ _database_remark_num_PDB _database_remark_text_PDB 1 REMARK 2 2 REMARK 2 RESOLUTION. 2.0 ANGSTROMS. 3 REMARK 3 4 REMARK 3 REFINEMENT. BY THE RESTRAINED LEAST-SQUARES PROCEDURE OF J. 5 REMARK 3 KONNERT AND W. HENDRICKSON (PROGRAM *PROLSQ*). THE R 6 REMARK 3 VALUE IS 0.176 FOR 12901 REFLECTIONS IN THE RESOLUTION 7 REMARK 3 RANGE 8.0 TO 2.0 ANGSTROMS WITH I .GT. SIGMA(I). 
# - - - - data truncated for brevity - - - - loop_ _database_rev_num_PDB _database_rev_author_name_PDB _database_rev_date_PDB _database_rev_date_original_PDB _database_rev_status_PDB _database_rev_mod_type_PDB 1 'Fitzgerald, Paula M.D' 91-10-15 90-04-30 'full release' 0 _diffrn_ambient_temperature 293(3) _diffrn_crystal_environment ; Mother liquor from the reservoir of the vapor diffusion experiment, mounted in room air ; _diffrn_crystal_physical_device ; 0.7 mm glass capillary, sealed with dental wax ; _diffrn_crystal_treatment ; Equilibrated in rotating anode radiation enclosure for 18 hours prior to beginning of data collection. ; _diffrn_measurement_method 'omega scan' _diffrn_measurement_details ; 440 frames, 0.20 degrees, 150 sec, detector distance 12 cm, detector angle 22.5 degrees ; _diffrn_measure_device_type '3-circle camera' _diffrn_measure_device_part 'Supper model x' _diffrn_measure_device_details 'none' _diffrn_radiation_collimation '0.3 mm double pinhole' _diffrn_radiation_monochromator graphite _diffrn_radiation_type 'Cu K\a' _diffrn_radiation_wavelength 1.54 _diffrn_rad_detector_type 'multiwire' _diffrn_rad_detector_part 'Siemens' _diffrn_rad_source_type 'rotating anode' _diffrn_rad_source_part 'Rigaku RU-200' _diffrn_rad_source_power '50 kw, 180 mA' _diffrn_rad_source_target '8mm x 0.4 mm broad-focus' loop_ _entity_id _entity_type _entity_name_common _entity_name_systematic _entity_source _entity_special_details 1 polymer 'HIV-1 protease' ECx.x.x.x ; Clone obtained from HIV strain NY-5. Expressed in E. coli. ; ; The enzymatically competent form of HIV protease is a dimer. This entity corresponds to one monomer of an active dimer. ; 2 non-polymer 'acetyl-pepstatin' 'acetyl-Ile-Val-Asp-Sta-Ala-Ile-Sta' 'Natural product isolated from actinomycetes' ; Statine: ((4S,3S)-4-amino-3-hydroxy-6-methylheptanoic acid. Acetyl-pepstatin was isolated by Dr. K. Oda, Osaka Prefecture University, and provided to us by Dr. Ben Dunn, University of Florida, and Dr. J. 
Kay, University of Wales. ; 3 water 'water' ? ? ? loop_ _entity_keywords_entity_id _entity_keywords_text 1 'polypeptide' 2 'natural product' 2 'inhibitor' 2 'reduced peptide' loop_ _entity_nonp_id _entity_nonp_entity_id _entity_nonp_formula _entity_nonp_formula_weight _entity_nonp_number_of_nh_atoms _entity_nonp_model_source _entity_nonp_model_details APS 2 'C31 H55 N5 O9' 641.8 45 'Built by hand using ChemNote in Quanta (MSI)' 'Geometry idealized using AMF (Merck)' loop_ _entity_nonp_atom_entity_id _entity_nonp_atom_atom_id _entity_nonp_atom_type_symbol _entity_nonp_atom_model_Cartn_x _entity_nonp_atom_model_Cartn_y _entity_nonp_atom_model_Cartn_z 2 1 C -0.15600 -0.90770 -2.11270 2 2 C -0.20530 -1.10010 -0.59490 2 3 O -0.51270 -2.16520 -0.06340 2 4 N 0.09550 -0.00790 0.11530 2 5 C 0.14840 -0.01830 1.58870 2 6 C 1.41550 -0.79710 2.04770 2 7 C 2.71100 -0.17870 1.47350 2 8 C 1.50570 -0.94000 3.58320 2 9 C 0.20050 1.42100 2.12200 2 10 O 0.58080 2.37350 1.43910 2 11 N -0.15500 1.55910 3.40030 # - - - - data truncated for brevity - - - - loop_ _entity_nonp_bond_entity_id _entity_nonp_bond_atom_id_1 _entity_nonp_bond_atom_id_2 _entity_nonp_bond_type 2 1 2 sing 2 2 3 doub 2 2 4 sing 2 4 5 sing 2 5 6 sing 2 5 9 sing 2 6 7 sing 2 6 8 sing 2 9 10 doub 2 9 11 sing # - - - - data truncated for brevity - - - - loop_ _entity_poly_entity_id _entity_poly_type _entity_poly_formula_weight _entity_poly_non_s_chirality _entity_poly_non_s_linkage _entity_poly_non_s_monomer _entity_poly_type_details 1 polypeptide(L) 10916 no no no ? 
loop_ _entity_poly_seq_entity_id _entity_poly_seq_num _entity_poly_seq_mon_id 1 1 PRO 1 2 GLN 1 3 ILE 1 4 THR 1 5 LEU 1 6 TRP 1 7 GLN 1 8 ARG 1 9 PRO 1 10 LEU 1 11 VAL 1 12 THR 1 13 ILE 1 14 LYS 1 15 ILE 1 16 GLY 1 17 GLY 1 18 GLN 1 19 LEU 1 20 LYS 1 21 GLU 1 22 ALA 1 23 LEU 1 24 LEU 1 25 ASP # - - - - data truncated for brevity - - - - _exptl_crystal_grow_method 'hanging drop' _exptl_crystal_grow_apparatus 'Linbro plates' _exptl_crystal_grow_atmosphere 'room air' _exptl_crystal_grow_pH 4.7 _exptl_crystal_grow_temp 18(3) _exptl_crystal_grow_time 'approximately 2 days' loop_ _exptl_crystal_grow_com_id _exptl_crystal_grow_com_sol_id _exptl_crystal_grow_com_name _exptl_crystal_grow_com_volume _exptl_crystal_grow_com_conc _exptl_crystal_grow_com_details 1 1 'HIV-1 protease' '0.002 ml' '6 mg/ml' ; The protein solution was in a buffer containing 25 mM NaCl, 100 mM NaMES/ MES buffer, pH 7.5, 3 mM NaAzide ; 2 2 'NaCl' '0.200 ml' '4 M' 'in 3 mM NaAzide' 3 2 'Acetic Acid' '0.047 ml' '100 mM' 'in 3 mM NaAzide' 4 2 'Na Acetate' '0.053 ml' '100 mM' ; in 3 mM NaAzide. Buffer components were mixed to produce a pH of 4.7 according to a ratio calculated from the pKa. The actual pH of solution 2 was not measured. ; 5 2 'water' '0.700 ml' 'neat' 'in 3 mM NaAzide' _refine_ls_number_reflns 12901 _refine_ls_number_restraints 6609 _refine_ls_number_parameters 7032 _refine_ls_R_Factor_obs 0.176 _refine_ls_weighting_scheme calc _refine_ls_weighting_details ; Sigdel model of Konnert-Hendrickson: Sigdel: Afsig + Bfsig*(sin(theta)/lambda-1/6) Afsig = 22.0, Bfsig = -150.0 at the beginning of refinement. Afsig = 15.5, Bfsig = -50.0 at the end of refinement. 
;

loop_
_refine_ls_restr_type
_refine_ls_restr_target
_refine_ls_restr_model
_refine_ls_restr_number
_refine_ls_restr_criterion
_refine_ls_restr_rejects
 'bond_d'           0.020  0.018  1654  '> 2\s'   22
 'angle_d'          0.030  0.038  2246  '> 2\s'  139
 'planar_d'         0.040  0.043   498  '> 2\s'   21
 'planar'           0.020  0.015   270  '> 2\s'    1
 'chiral'           0.150  0.177   278  '> 2\s'    2
 'singtor_nbd'      0.500  0.216   582  '> 2\s'    0
 'multtor_nbd'      0.500  0.207   419  '> 2\s'    0
 'xyhbond_nbd'      0.500  0.245   149  '> 2\s'    0
 'planar_tor'       3.0    2.6     203  '> 2\s'    9
 'staggered_tor'   15.0   17.4     298  '> 2\s'   31
 'orthonormal_tor' 20.0   18.1      12  '> 2\s'    1

loop_
_refine_ls_shell_d_res_low
_refine_ls_shell_d_res_high
_refine_ls_shell_reflns
_refine_ls_shell_R_factor_obs
 8.00  4.51  1226  0.196
 4.51  3.48  1679  0.146
 3.48  2.94  2014  0.160
 2.94  2.59  2147  0.182
 2.59  2.34  2127  0.193
 2.34  2.15  2061  0.203
 2.15  2.00  1647  0.188

loop_
_refine_occupancy_class
_refine_occupancy_treatment
_refine_occupancy_value
_refine_occupancy_details
 'protein'                  fix  1.00  ?
 'solvent'                  fix  1.00  ?
 'inhibitor orientation 1'  fix  0.65  ?
 'inhibitor orientation 2'  fix  0.35
;
  The inhibitor binds to the enzyme in two alternate conformations. The
  occupancy of each conformation was adjusted so as to result in
  approximately equal mean thermal factors for the atoms in each
  conformation.
;

loop_
_refine_B_iso_class
_refine_B_iso_treatment
 'protein'    isotropic
 'solvent'    isotropic
 'inhibitor'  isotropic

_reflns_data_reduction_method
;
  Xengen program scalei. Anomalous pairs were merged. Scaling proceeded in
  several passes, beginning with 1-parameter fit and ending with
  3-parameter fit.
;
_reflns_data_reduction_details
;
  Merging and scaling based on only those reflections with I > \s(I).
; _reflns_d_resolution_high 2.00 _reflns_d_resolution_low 8.00 _reflns_limit_h_max 22 _reflns_limit_h_min 0 _reflns_limit_k_max 46 _reflns_limit_k_min 0 _reflns_limit_l_max 57 _reflns_limit_l_min 0 _reflns_number_observed 7228 _reflns_observed_criterion '> 1 \s(I)' _reflns_special_details none loop_ _reflns_shell_d_res_high _reflns_shell_d_res_low _reflns_shell_meanI/sigI_obs _reflns_shell_count_measured_obs _reflns_shell_count_unique_obs _reflns_shell_possible_%_obs _reflns_shell_Rmerge_F_obs 31.38 3.82 69.8 9024 2540 96.8 1.98 3.82 3.03 26.1 7413 2364 95.1 3.85 3.03 2.65 10.5 5640 2123 86.2 6.37 2.65 2.41 6.4 4322 1882 76.8 8.01 2.41 2.23 4.3 3247 1714 70.4 9.86 2.23 2.10 3.1 1140 812 33.3 13.99 _struct_title ; HIV-1 protease complex with acetyl-pepstatin ; loop_ _struct_keywords 'enzyme-inhibitor complex' 'aspartyl protease' 'structure-based drug design' 'static disorder' loop_ _struct_asym_id _struct_asym_entity_id _struct_asym_special_details A 1 'one monomer of the dimeric enzyme' B 1 'one monomer of the dimeric enzyme' C 2 'one partially occupied position for the inhibitor' D 2 'one partially occupied position for the inhibitor' loop_ _struct_biol_id _struct_biol_special_details 1 ; significant deviations from twofold symmetry exist in this dimeric enzyme ; 2 ; The drug binds to this enzyme in two roughly twofold symmetric modes. Hence this biological unit (2) is roughly twofold symmetric to biological unit (3). Disorder in the protein chain indicated with alternate indicator 1 should be used with this biological unit. ; 3 ; The drug binds to this enzyme in two roughly twofold symmetric modes. Hence this biological unit (3) is roughly twofold symmetric to biological unit (2). Disorder in the protein chain indicated with alternate indicator 2 should be used with this biological unit. 
; loop_ _struct_biol_gen_biol_id _struct_biol_gen_asym_id _struct_biol_gen_symmetry 1 A 1_555 1 B 1_555 2 A 1_555 2 B 1_555 2 C 1_555 3 A 1_555 3 B 1_555 3 D 1_555 loop_ _struct_biol_keywords_biol_id _struct_biol_keywords_text 1 'aspartyl-protease' 1 'aspartic-protease' 1 'acid-protease' 1 'aspartyl-proteinase' 1 'aspartic-proteinase' 1 'acid-proteinase' 1 'enzyme' 1 'protease' 1 'proteinase' 1 'dimer' 2 'drug-enzyme complex' 2 'inhibitor-enzyme complex' 2 'drug-protease complex' 2 'inhibitor-protease complex' 3 'drug-enzyme complex' 3 'inhibitor-enzyme complex' 3 'drug-protease complex' 3 'inhibitor-protease complex' loop_ _struct_conf_id _struct_conf_conf_type_id _struct_conf_beg_label_res_id _struct_conf_beg_label_asym_id _struct_conf_beg_label_seq_id _struct_conf_end_label_res_id _struct_conf_end_label_asym_id _struct_conf_end_label_seq_id _struct_conf_special_details HELX1 HELX-RHAL ARG A 87 GLN A 92 ? HELX2 HELX-RHAL ARG B 287 GLN B 292 ? STRN1 STRN PRO A 1 LEU A 5 ? STRN2 STRN CYS B 295 PHE B 299 ? STRN3 STRN CYS A 95 PHE A 299 ? STRN4 STRN PRO B 201 LEU B 205 ? # - - - - data truncated for brevity - - - - TURN1 TURN-TY1P ILE A 15 GLN A 18 ? TURN2 TURN-TY2 GLY A 49 GLY A 52 ? TURN3 TURN-TY1P ILE A 55 HIS A 69 ? TURN4 TURN-TY1 THR A 91 GLY A 94 ? # - - - - data truncated for brevity - - - - loop_ _struct_conf_type_id _struct_conf_type_criteria _struct_conf_type_reference HELX-RHAL 'author judgement' ? STRN 'author judgement' ? TURN-TY1 'author judgement' ? TURN-TY1P 'author judgement' ? TURN-TY2 'author judgement' ? TURN-TY2P 'author judgement' ? 
loop_
_struct_conn_id
_struct_conn_conn_type_id
_struct_conn_par1_label_res_id
_struct_conn_par1_label_asym_id
_struct_conn_par1_label_seq_id
_struct_conn_par1_label_atom_id
_struct_conn_role_par1
_struct_conn_symmetry_par1
_struct_conn_par2_label_res_id
_struct_conn_par2_label_asym_id
_struct_conn_par2_label_seq_id
_struct_conn_par2_label_atom_id
_struct_conn_role_par2
_struct_conn_symmetry_par2
_struct_conn_special_details
 C1  saltbr  ARG A  87  NZ1  positive  1_555  GLU A  92  OE1  negative  1_555  ?
 C2  hydrog  ARG B 287  N    donor     1_555  GLY B 292  O    acceptor  1_555  ?
# - - - - data truncated for brevity - - - -

loop_
_struct_conn_type_id
_struct_conn_type_criteria
_struct_conn_type_reference
 saltbr  'negative to positive distance > 2.5 \%A, < 3.2 \%A'                ?
 hydrog  'N to O distance > 2.5 \%A, < 3.5 \%A, N O C angle < 120 degrees'   ?

loop_
_struct_site_id
_struct_site_special_details
 'P2 site C'
;
  residues with a contact < 3.7 \%A to an atom in the P2 moiety of the
  inhibitor in the conformation with _struct_asym_id = C
;
 'P2 site D'
;
  residues with a contact < 3.7 \%A to an atom in the P1 moiety of the
  inhibitor in the conformation with _struct_asym_id = D
;

loop_
_struct_site_gen_id
_struct_site_gen_site_id
_struct_site_gen_label_res_id
_struct_site_gen_label_asym_id
_struct_site_gen_label_seq_id
_struct_site_gen_symmetry
_struct_site_gen_special_details
 1  1  VAL  A   32  1_555  ?
 2  1  ILE  A   47  1_555  ?
 3  1  VAL  A   82  1_555  ?
 4  1  ILE  A   84  1_555  ?
 5  2  VAL  B  232  1_555  ?
 6  2  ILE  B  247  1_555  ?
 7  2  VAL  B  282  1_555  ?
 8  2  ILE  B  284  1_555  ?

loop_
_struct_site_keywords_site_id
_struct_site_keywords_text
 'P2 site C'  'binding site'
 'P2 site C'  'binding pocket'
 'P2 site C'  'P2 site'
 'P2 site C'  'P2 pocket'
 'P2 site D'  'binding site'
 'P2 site D'  'binding pocket'
 'P2 site D'  'P2 site'
 'P2 site D'  'P2 pocket'

_symmetry_cell_setting          orthorhombic
_symmetry_Int_Tables_number     18
_symmetry_space_group_name_H-M  'P 21 21 2'

loop_
_symmetry_equiv_pos_as_xyz
 +x,+y,+z
 -x,-y,z
 1/2+x,1/2-y,-z
 1/2-x,1/2+y,-z
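
As a rough illustration of how a program might consume the symmetry operators that close the example, here is a minimal Python sketch that turns each `_symmetry_equiv_pos_as_xyz` string into an operation on fractional coordinates. The function names (`parse_component`, `apply_op`, `cart_to_frac`) are hypothetical and are not part of any CIF tool. The Cartesian-to-fractional step assumes the orthorhombic cell of this entry, where the `_atom_sites_fract_tran_matrix` is diagonal with elements 1/a, 1/b and 1/c; a general cell would need the full matrix.

```python
from fractions import Fraction
import re

# Cell lengths copied from _cell_length_a/b/c in the example (esds dropped).
CELL = (58.39, 86.70, 46.27)

# Operators copied from the _symmetry_equiv_pos_as_xyz loop.
SYMOPS = ["+x,+y,+z", "-x,-y,z", "1/2+x,1/2-y,-z", "1/2-x,1/2+y,-z"]

def parse_component(expr):
    """Return (coeffs, shift) for one component such as '1/2-y'."""
    coeffs = [0, 0, 0]
    shift = Fraction(0)
    # Split into signed terms, e.g. '1/2-y' -> ['1/2', '-y'].
    for term in re.findall(r"[+-]?[^+-]+", expr.replace(" ", "")):
        sign = -1 if term.startswith("-") else 1
        body = term.lstrip("+-")
        if body in ("x", "y", "z"):
            coeffs["xyz".index(body)] = sign
        else:
            shift += sign * Fraction(body)
    return coeffs, shift

def apply_op(op, frac):
    """Apply one operator string to a fractional coordinate triple."""
    out = []
    for comp in op.split(","):
        coeffs, shift = parse_component(comp)
        out.append(sum(c * v for c, v in zip(coeffs, frac)) + float(shift))
    return tuple(out)

def cart_to_frac(xyz, cell=CELL):
    """Orthorhombic shortcut: divide each Cartesian coordinate by a, b, c."""
    return tuple(v / length for v, length in zip(xyz, cell))

# First atom of the example (N of VAL A 11), Cartesian coordinates in angstroms.
frac = cart_to_frac((25.369, 30.691, 11.795))
mates = [apply_op(op, frac) for op in SYMOPS]
```

`mates` then holds the fractional coordinates of the four symmetry-equivalent positions of that atom, the first being the atom itself. Translations are kept as exact fractions while parsing so that operators such as `1/2+x` are not rounded before they are applied.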