Frances Bernstein writes:

> I have questions about the mmCIF list of allowed values for
> _struct_conf.conf_type_id and how to fit existing PDB information
> into these categories.
>
> For helices, a number of classes are defined in mmCIF and there is a
> class HELX-OTHR 'helix other (protein)'. However, some PDB entries have
> HELIX records that are classified as alpha with notes such as kinked,
> bent, alpha but 3/10 in part, etc. Should such helices fall into the
> category HELX-RHAL 'right-handed alpha helix (protein)' or HELX-OTHR?
>
> For turns there are several types given in mmCIF and there is TURN-OTHR
> 'turn other (protein)'. Most turns in PDB entries are not classified.
> It does not seem appropriate to list these turns as TURN-OTHR; is it
> expected that the type will be computed from the coordinates?
>
> I suppose that one could make the general comment here that the
> geometry of secondary structures is not always as regular as people who
> want to classify it would like it to be.
>
> A general question which arises is whether categories should be added
> for 'unspecified' or 'unknown' which clearly have a different meaning than
> 'other'.

Fran raises a number of very relevant issues in this message, and I have
attempted to respond to most of them by rewriting the enumeration list for
_struct_conf.conf_type_id. As you will see in that list, which I include
below, I have added, for each major category of backbone conformation, types
of not-specified and other. Not-specified is self-explanatory; other means
that the conformation type has been examined and does not conform to an
accepted category. For not-specified I use no abbreviation; for other I use
OT.

In the process of making these changes, I have changed the format of the
types a bit, to be more consistent and more CIF-like (for instance, adding
another separator between handedness and type, and making the separators
underbars instead of hyphens).

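To make the naming scheme concrete, here is a minimal Python sketch (not part of the dictionary; the function name parse_conf_type and the ConfType field names are hypothetical, chosen only for illustration). It assumes that a code is made of underbar-separated fields in the order class, optional handedness, optional type, optional polymer, and that an absent field means "not specified".

```python
from typing import NamedTuple, Optional


class ConfType(NamedTuple):
    kind: str                  # HELX, TURN or STRN
    handedness: Optional[str]  # RH or LH; None when not specified
    subtype: Optional[str]     # e.g. AL, 3T, TY1P, OT; None when not specified
    polymer: Optional[str]     # P (protein) or N (nucleic acid); None if absent


def parse_conf_type(code: str) -> ConfType:
    """Split a conformation type code into its underbar-separated fields."""
    parts = code.split("_")
    kind, rest = parts[0], parts[1:]
    if kind not in {"HELX", "TURN", "STRN"}:
        raise ValueError(f"unrecognised conformation class in {code!r}")

    # The polymer flag, when present, is always the last field.
    polymer = rest.pop() if rest and rest[-1] in {"P", "N"} else None
    # Handedness, when present, directly follows the class.
    handedness = rest.pop(0) if rest and rest[0] in {"RH", "LH"} else None
    # At most one field (the type) may remain.
    if len(rest) > 1:
        raise ValueError(f"unexpected extra fields in {code!r}")
    subtype = rest[0] if rest else None
    return ConfType(kind, handedness, subtype, polymer)
```

Under these assumptions, a missing type (as in HELX_RH_P) comes back as None, while OT is returned as an explicit type, which keeps "not specified" and "other" distinct.
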
I'm still not totally happy with this, but I think we are getting closer to
something useable. What makes me particularly unhappy ties in with Fran's
question of whether we call a helix that is fundamentally alpha, but that has
a kink in it, alpha or other. We have provided a mechanism whereby the author
can say what criteria he/she used to call something a helix (or whatever),
and those criteria can be either algorithmic or judgement-based. This can all
be found in _struct_conf_type.criteria. But it seems to me we really need to
provide for multiple criteria, so that a structure could be analysed *both*
algorithmically and in a judgement-based way. I've given some thought to how
to do this, but I'm not sure what the best way to proceed is. Comments are of
course welcome.

Here is the way the list looks now:

    _item_enumeration.detail
      HELX_P
;     helix with handedness and type not specified (protein)
;
      HELX_OT_P
;     helix with handedness and type that do not conform to an accepted
      category (protein)
;
#
      HELX_RH_P
;     right-handed helix with type not specified (protein)
;
      HELX_RH_OT_P
;     right-handed helix with type that does not conform to an accepted
      category (protein)
;
      HELX_RH_AL_P   'right-handed alpha helix (protein)'
      HELX_RH_GA_P   'right-handed gamma helix (protein)'
      HELX_RH_OM_P   'right-handed omega helix (protein)'
      HELX_RH_PI_P   'right-handed pi helix (protein)'
      HELX_RH_3T_P   'right-handed 3-10 helix (protein)'
      HELX_RH_PP_P   'right-handed polyproline helix (protein)'
#
      HELX_LH_P
;     left-handed helix with type not specified (protein)
;
      HELX_LH_OT_P
;     left-handed helix with type that does not conform to an accepted
      category (protein)
;
      HELX_LH_AL_P   'left-handed alpha helix (protein)'
      HELX_LH_GA_P   'left-handed gamma helix (protein)'
      HELX_LH_OM_P   'left-handed omega helix (protein)'
      HELX_LH_PI_P   'left-handed pi helix (protein)'
      HELX_LH_3T_P   'left-handed 3-10 helix (protein)'
      HELX_LH_PP_P   'left-handed polyproline helix (protein)'
#
      HELX_N
;     helix with handedness and type not specified (nucleic acid)
;
      HELX_OT_N
;     helix with handedness and type that do not conform to an accepted
      category (nucleic acid)
;
#
      HELX_RH_N
;     right-handed helix with type not specified (nucleic acid)
;
      HELX_RH_OT_N
;     right-handed helix with type that does not conform to an accepted
      category (nucleic acid)
;
      HELX_RH_A_N    'right-handed A helix (nucleic acid)'
      HELX_RH_B_N    'right-handed B helix (nucleic acid)'
      HELX_RH_Z_N    'right-handed Z helix (nucleic acid)'
#
      HELX_LH_N
;     left-handed helix with type not specified (nucleic acid)
;
      HELX_LH_OT_N
;     left-handed helix with type that does not conform to an accepted
      category (nucleic acid)
;
      HELX_LH_A_N    'left-handed A helix (nucleic acid)'
      HELX_LH_B_N    'left-handed B helix (nucleic acid)'
      HELX_LH_Z_N    'left-handed Z helix (nucleic acid)'
#
      TURN_P         'turn with type not specified (protein)'
      TURN_OT_P
;     turn with type that does not conform to an accepted category (protein)
;
      TURN_TY1_P     'type I turn (protein)'
      TURN_TY1P_P    'type I prime turn (protein)'
      TURN_TY2_P     'type II turn (protein)'
      TURN_TY2P_P    'type II prime turn (protein)'
      TURN_TY3_P     'type III turn (protein)'
      TURN_TY3P_P    'type III prime turn (protein)'
#
      STRN           'beta strand (protein)'

Paula

********************************************************************************
Dr. Paula M. D. Fitzgerald  ______________  voice and FAX: (908) 594-5510
Merck Research Laboratories ______________  email: paula_fitzgerald@merck.com
P.O. Box 2000, Ry50-105     ______________     or  bean@merck.com
Rahway, NJ 07065 USA
(for express mail use 126 E. Lincoln Ave. instead of P. O. Box 2000)
********************************************************************************