Hello again -

Yesterday afternoon and evening, Helen, John and I once again found ourselves gathered in central New Jersey, trying at last to really wrap up this stage of dictionary review and modification. When last we wrote (on January 29th) we had dealt with what we considered the minor issues, and were bringing you all up to date with those changes before tackling the larger and more complicated issues. We began dealing with the larger issues that same afternoon, and continued that effort throughout February. As of today, we feel that we have done what we can with the larger issues, and we are ready to release a new version of the dictionary to you.

But not only to you - we would like at this point to open the review process to a larger audience, by posting notices about all of this to some of the major structural newsgroups and mailing lists. But before we do that, we would like to give all of you one more chance to look things over. As we really want to get this moving, we plan to post the announcements on March 14 (carefully avoiding the Ides of March).

We would also like to draw your attention to the new and improved mmCIF Web page, which will be available in the usual place starting today (March 7). We have reorganized things a bit, added a lot of supporting and introductory material, and in general tried to make this a lot more understandable to the general user. Give it a look. The new version of the dictionary (version 0.8, a major step forward) is also there.

Now, for the nuts and bolts. This message will end with the audit trail, as usual, but each of the major issues that we dealt with deserves a few words of its own.

1) There had been a discussion of alternative nomenclatures for the sequence, residue name, residue number, etc. identifiers in the atom site record. After a lot of discussion, we decided to introduce alternative items for each of these, not just for the residue number, which is what we had done before.
These new data items have names like _atom_site.auth_seq_id, completely parallel to the current names like _atom_site.label_seq_id (auth is an abbreviation for author's notation). So a completely independent second set has been introduced throughout the dictionary, providing the user complete freedom to adopt a personal nomenclature. The primary data items, which do have to obey some rules, remain the default and are mandatory; the author's notation data items are always optional. There is one confusing thing about this - in order to be consistent, we changed the meaning of _atom_site.label_seq_id to be the one that must run from 1 to n (and whose parent is _entity_poly_seq.num); previously _atom_site.entity_seq_num had played this role. We truly hope this does not generate too much confusion - it really was necessary to make a cleaner schema.

2) Matrices and vectors for non-crystallographic symmetry have been added to the STRUCT_NCS category. That category group was reorganized a good bit to make the data structure cleaner.

3) There had been a suggestion that we needed an ENTITY_LINK category to handle linkages that occur at the entity level (for instance, a disulfide bond between the A and B chains of insulin). We began to create such a category, but then realized that it was completely redundant with the CHEM_COMP_LINK category, so we stepped back from that idea and simply created a generalized CHEM_LINK category that can be referred to from both ENTITY and CHEM_COMP.

4) The data type code "char" was changed to "line". The complete set is now text (text of any length), line (limited to one line), and code (limited to one word).

5) We talked at length about the issue of NMR ensembles of structures and how to handle them. Our decision was that the concept of ensembles was already provided for in the ATOM_SITES_ALT category and its subcategories.
This may take a bit more justification and explanation, but we are convinced that these data items, which are already present, will handle this whole issue cleanly.

6) A completely new SOFTWARE category has been created, replacing the old COMP_PROG category.

7) Another big item was how to specify segments of structure (or of several structures) in order to point to entries in external databases. It had been suggested that we do this at the entity level, but after a lot of thought we decided that such database references were really annotation of the structure, and that they properly belonged in the STRUCT category group, where all other structure annotation takes place. So we created STRUCT_REF, which we think handles all of the cases that were raised with us. The old ENTITY_REFERENCE categories are gone.

There were several other smaller things that got taken care of - they appear in the audit trail, but don't really require discussion here.

Again - thank you all so much for all of your suggestions and comments, and for helping to bring the dictionary so close to being a final and complete document.

Paula, John, Helen

- - - - - - - - - - - - - -

0.7.31 1996-02-12
;
  Changes (JDW):
  + Added data items for _database_pdb_matrix.tvect_matrix[][] and _database_pdb_matrix.tvect_vector[].
  + Generalized category CHEM_LINK to handle descriptions of any type of linkage. Created CHEM_COMP_LINK to describe linkages between components, and ENTITY_LINK to describe linkages between entities (and within entities between nonsequential components). Both CHEM_COMP_LINK and ENTITY_LINK reference the linkage description in the CHEM_LINK_* categories.
;

0.7.32 1996-02-17
;
  Changes (JDW):
  + _atom_site.entity_id renamed _atom_site.label_entity_id.
  + _atom_site.entity_seq_num deleted.
  + Added items _atom_site.auth_asym_id, _atom_site.auth_atom_id, _atom_site.auth_comp_id, and _atom_site.auth_seq_id. These items provide placeholders for alternative nomenclature that may be used by the author.
  + Set the parentage for _atom_site.label_seq_id to _entity_poly_seq.num. All components of the atom site label (_atom_site.label_*) are now linked to the mmCIF hierarchical description of structure. The data items in _atom_site.auth_* may be used by authors to provide alternative identifiers in the atom site which conform with the scheme that is used in the publication of the structure.
  + Added category group mm_atom_site_auth_label.
  + Added auth_asym_id, auth_atom_id, auth_comp_id, and auth_seq_id child data items to the categories: GEOM_ANGLE, GEOM_BOND, GEOM_CONTACT, STRUCT_CONF, STRUCT_CONN, STRUCT_MON_NUCL, STRUCT_PROT, STRUCT_PROT_CIS, STRUCT_NCS_DOM_GEN, STRUCT_SHEET_HBOND, STRUCT_SHEET_RANGE, and STRUCT_SITE_GEN.
;

0.7.33 1996-02-19
;
  Changes (JDW):
  + Replaced category COMP_PROG with category SOFTWARE supplied by P. Bourne.
  + Fine tuned some values of _item_type.code. Fixed regular expression for code and ucode.
;

0.7.34 1996-02-20
;
  Changes (JDW):
  + Integrated STRUCT_REF, STRUCT_REF_SEQ and STRUCT_REF_SEQ_DIF from PMDF.
  + Removed ENTITY_REFERENCE and ENTITY_POLY_SEQ_DIF.
  + Integrated modified categories STRUCT_NCS_DOM, STRUCT_NCS_DOM_LIM, STRUCT_NCS_ENS, STRUCT_NCS_ENS_GEN, and STRUCT_NCS_OPER from PMDF.
  + Changed _item_type.code's 'char' and 'uchar' to 'line' and 'uline'.
;

0.8.0 1996-03-06
;
  Changes (PMDF, HB, JDW):
  + Added unit type 8pi2_angstroms_squared for B anisotropic temperature factors, and added conversion factor for this new unit type in the ITEM_UNITS_CONVERSION category.
  + Changed _item_type.code for _symmetry_equiv.id to 'code'.
  + Added default value 'no' to _chem_comp.mon_nstd_flag.
;

********************************************************************************
Dr. Paula M. D. Fitzgerald  ______________ voice and FAX: (908) 594-5510
Merck Research Laboratories ______________ email: paula_fitzgerald@merck.com
P.O. Box 2000, Ry50-105     ______________    or bean@merck.com
Rahway, NJ 07065 USA
(for express mail use 126 E. Lincoln Ave. instead of P. O. Box 2000)
********************************************************************************
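
P.S. Since point 1 above is the change most likely to cause confusion, here is a minimal sketch of how the label and auth identifiers relate. It is only an illustration under stated assumptions, not part of the dictionary: the records are plain Python dictionaries with made-up values, and the helper names (unparented_label_seq_ids, display_seq_id) are hypothetical. It shows the two rules described above: every label_seq_id must have a parent in _entity_poly_seq.num (which runs from 1 to n), while auth_seq_id is optional and free-form.

```python
# Hypothetical atom-site records for one polymer entity. The keys mirror the
# dictionary item names; the values are invented purely for illustration.
atom_sites = [
    {"label_seq_id": 1, "label_comp_id": "ILE", "auth_seq_id": "16"},
    {"label_seq_id": 2, "label_comp_id": "VAL", "auth_seq_id": "17"},
    {"label_seq_id": 3, "label_comp_id": "GLY", "auth_seq_id": "17A"},
    {"label_seq_id": 4, "label_comp_id": "GLY"},  # author gave no alternative
]

# Stand-in for the _entity_poly_seq.num values of the same entity (1 to n).
entity_poly_seq_nums = [1, 2, 3, 4]


def unparented_label_seq_ids(sites, parent_nums):
    """Return the label_seq_id values that have no parent in parent_nums."""
    parents = set(parent_nums)
    return sorted({site["label_seq_id"] for site in sites} - parents)


def display_seq_id(site):
    """Prefer the author's identifier when present; else use the label."""
    return site.get("auth_seq_id", str(site["label_seq_id"]))


if __name__ == "__main__":
    print(unparented_label_seq_ids(atom_sites, entity_poly_seq_nums))
    print([display_seq_id(site) for site in atom_sites])
```

The label items carry the linkage into the rest of the data structure, so they are the ones checked against the parent; the auth items are only carried along for display in whatever form the author published.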