This is an archive copy of the IUCr web site dating from 2008. For current content please visit https://www.iucr.org.

Re: secondary structure

Paula Fitzgerald (paula_fitzgerald@Merck.Com)
Thu, 5 Oct 95 14:19:54 EDT


Frances Bernstein writes:

>     I have questions about the mmCIF list of allowed values for
> _struct_conf.conf_type_id and how to fit existing PDB information
> into these categories.
> 
>     For helices, a number of classes are defined in mmCIF and there is a
> class HELX-OTHR 'helix other (protein)'.  However, some PDB entries have
> HELIX records that are classified as alpha with notes such as kinked,
> bent, alpha but 3/10 in part, etc.  Should such helices fall into the
> category HELX-RHAL 'right-handed alpha helix (protein)' or HELX-OTHR?
> 
>     For turns there are several types given in mmCIF and there is TURN-OTHR
> 'turn other (protein)'.  Most turns in PDB entries are not classified.
> It does not seem appropriate to list these turns as TURN-OTHR; is it
> expected that the type will be computed from the coordinates?
> 
>     I suppose that one could make the general comment here that the
> geometry of secondary structures is not always as regular as people who
> want to classify it would like it to be.
> 
>     A general question which arises is whether categories should be added
> for 'unspecified' or 'unknown' which clearly have a different meaning than
> 'other'.

Fran raises a number of very relevant issues in this message, and I have
attempted to respond to most of them by rewriting the enumeration list
for _struct_conf.conf_type_id.  As you will see in that list, which I include
below, I have added, for each major category of backbone conformation, types
of not-specified and other.  Not-specified is self-explanatory;  other means
that the conformation type has been examined, and it does not conform to an
accepted category.  For not-specified I use no abbreviation; for other I
use OT.

In the process of making these changes, I have changed the format of the
types a bit, to be more consistent and to be more CIF-like (for instance,
adding another separator between handedness and type, and making the separators
underbars instead of hyphens).
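
The format change can be illustrated with a small Python sketch.  The
function name build_conf_type_id and the OLD_TO_NEW table are hypothetical,
introduced only for illustration.  The three old-style identifiers are the
ones quoted in Fran's message; their new-style counterparts are inferred by
matching the descriptions, not taken from any official mapping.

```python
from typing import Optional


def build_conf_type_id(kind: str,
                       handedness: Optional[str] = None,
                       conf_type: Optional[str] = None,
                       polymer: Optional[str] = "P") -> str:
    """Join the non-empty components with underbars.

    Leaving a component out encodes 'not specified'; pass conf_type='OT'
    for 'other'.  Hypothetical helper, not part of any mmCIF software.
    """
    parts = [kind, handedness, conf_type, polymer]
    return "_".join(p for p in parts if p)


# Old-style identifiers quoted in the message, paired with the new-style
# identifier whose description matches (an inferred correspondence).
OLD_TO_NEW = {
    "HELX-RHAL": build_conf_type_id("HELX", "RH", "AL"),
    "HELX-OTHR": build_conf_type_id("HELX", conf_type="OT"),
    "TURN-OTHR": build_conf_type_id("TURN", conf_type="OT"),
}
```

Each component gets its own underbar-separated slot, so handedness and type
can be omitted independently of one another.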

I'm still not totally happy with this, but I think we are getting closer to
something useable.  What makes me particularly unhappy ties in with Fran's
question of whether we call a helix that is fundamentally alpha but that
has a kink in it alpha or other.  We have provided a mechanism whereby the
author can say what criteria he/she used to call something a helix (or
whatever) and those criteria can be either algorithmic or judgement-based.
This can all be found in _struct_conf_type.criteria.

But it seems to me we really need to provide for multiple criteria, so that
a structure could be analysed *both* algorithmically and in a judgement-based
way.  I've given some thought to how to do this, but I'm not sure what is
the best way to proceed.  Comments are of course welcome.

Here is the way the list looks now:


    _item_enumeration.detail      HELX_P
;                                 helix with handedness and type not specified
                                  (protein)
;
                                  HELX_OT_P
;                                 helix with handedness and type that do not
                                  conform to an accepted category (protein)
;
#
                                  HELX_RH_P
;                                 right-handed helix with type not specified
                                  (protein)
;
                                  HELX_RH_OT_P
;                                 right-handed helix with type that does not
                                  conform to an accepted category (protein)
;
                                  HELX_RH_AL_P
                                 'right-handed alpha helix (protein)'
                                  HELX_RH_GA_P
                                 'right-handed gamma helix (protein)'
                                  HELX_RH_OM_P
                                 'right-handed omega helix (protein)'
                                  HELX_RH_PI_P
                                 'right-handed pi helix (protein)'
                                  HELX_RH_3T_P
                                 'right-handed 3-10 helix (protein)'
                                  HELX_RH_PP_P
                                 'right-handed polyproline helix (protein)'
#
                                  HELX_LH_P
;                                 left-handed helix with type not specified
                                  (protein)
;
                                  HELX_LH_OT_P
;                                 left-handed helix with type that does not
                                  conform to an accepted category (protein)
;
                                  HELX_LH_AL_P
                                 'left-handed alpha helix (protein)'
                                  HELX_LH_GA_P
                                 'left-handed gamma helix (protein)'
                                  HELX_LH_OM_P
                                 'left-handed omega helix (protein)'
                                  HELX_LH_PI_P
                                 'left-handed pi helix (protein)'
                                  HELX_LH_3T_P
                                 'left-handed 3-10 helix (protein)'
                                  HELX_LH_PP_P
                                 'left-handed polyproline helix (protein)'
#
                                  HELX_N
;                                 helix with handedness and type not specified
                                  (nucleic acid)
;
                                  HELX_OT_N
;                                 helix with handedness and type that do not
                                  conform to an accepted category (nucleic 
                                  acid)
;
#
                                  HELX_RH_N
;                                 right-handed helix with type not specified
                                  (nucleic acid)
;
                                  HELX_RH_OT_N
;                                 right-handed helix with type that does not
                                  conform to an accepted category (nucleic 
                                  acid)
;
                                  HELX_RH_A_N
                                 'right-handed A helix (nucleic acid)'
                                  HELX_RH_B_N
                                 'right-handed B helix (nucleic acid)'
                                  HELX_RH_Z_N
                                 'right-handed Z helix (nucleic acid)'
#
                                  HELX_LH_N
;                                 left-handed helix with type not specified
                                  (nucleic acid)
;
                                  HELX_LH_OT_N
;                                 left-handed helix with type that does not
                                  conform to an accepted category (nucleic 
                                  acid)
;
                                  HELX_LH_A_N
                                 'left-handed A helix (nucleic acid)'
                                  HELX_LH_B_N
                                 'left-handed B helix (nucleic acid)'
                                  HELX_LH_Z_N
                                 'left-handed Z helix (nucleic acid)'
#
                                  TURN_P
                                 'turn with type not specified (protein)'
                                  TURN_OT_P
;                                 turn with type that does not conform to an
                                  accepted category (protein)
;
                                  TURN_TY1_P
                                 'type I turn (protein)'
                                  TURN_TY1P_P
                                 'type I prime turn (protein)'
                                  TURN_TY2_P
                                 'type II turn (protein)'
                                  TURN_TY2P_P
                                 'type II prime turn (protein)'
                                  TURN_TY3_P
                                 'type III turn (protein)'
                                  TURN_TY3P_P
                                 'type III prime turn (protein)'
#
                                  STRN
                                 'beta strand (protein)'
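
As a rough illustration of how the identifiers in this list decompose, here
is a minimal Python sketch that splits one into its parts.  The names
ConfType and parse_conf_type_id are hypothetical, and the parsing rules are
an assumption read off the list above: an optional RH/LH handedness, an
optional type (OT meaning other), and an optional trailing P or N.  A missing
component is returned as None, meaning not specified.  Note that STRN carries
no P/N suffix in the list, so its polymer comes back as None.

```python
from typing import NamedTuple, Optional

HANDEDNESS = {"RH": "right-handed", "LH": "left-handed"}
POLYMER = {"P": "protein", "N": "nucleic acid"}


class ConfType(NamedTuple):
    kind: str                  # HELX, TURN or STRN
    handedness: Optional[str]  # RH, LH, or None when not specified
    conf_type: Optional[str]   # e.g. AL, 3T, TY1P, OT; None when not specified
    polymer: Optional[str]     # P, N, or None when there is no suffix


def parse_conf_type_id(identifier: str) -> ConfType:
    """Split an underbar-separated identifier into its components."""
    tokens = identifier.split("_")
    kind, rest = tokens[0], tokens[1:]
    # The polymer code, when present, is always the last component.
    polymer = rest.pop() if rest and rest[-1] in POLYMER else None
    # Handedness, when present, directly follows the class name.
    handedness = rest.pop(0) if rest and rest[0] in HANDEDNESS else None
    # Whatever single component remains is the type.
    conf_type = rest.pop(0) if rest else None
    if rest:
        raise ValueError(f"unexpected extra components in {identifier!r}")
    return ConfType(kind, handedness, conf_type, polymer)
```

For example, parse_conf_type_id("HELX_RH_OT_N") yields a right-handed
nucleic-acid helix of type OT, while parse_conf_type_id("HELX_P") leaves both
handedness and type as None.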

Paula

********************************************************************************
 Dr. Paula M. D. Fitzgerald  ______________ voice and FAX: (908) 594-5510
   Merck Research Laboratories ______________ email: paula_fitzgerald@merck.com
     P.O. Box 2000, Ry50-105     ______________ or bean@merck.com           
       Rahway, NJ 07065  USA 
         (for express mail use 126 E. Lincoln Ave. instead of P. O. Box 2000)  
********************************************************************************