This is an archive copy of the IUCr web site dating from 2008. For current content please visit https://www.iucr.org.

Re: MODEL guidance please

Eldon Ulrich (elu@gir.nmrfam.wisc.edu)
Wed, 13 Mar 1996 16:22:46 -0600


Herbert Bernstein writes:

>Paula's message says that NMR ensembles of structure are to be handled
>by the existing ATOM_SITES_ALT category.  I am trying to design code
>for this purpose and would appreciate guidance on what is intended.
>The specific test case that keeps tripping me up is as follows:
> Suppose you have 50 models.  Within each model, suppose there
>are two regions of microheterogeneity.  Within each region suppose
>one variant has 4 residues and the other has 5 (just to make sure
>we know how to handle ENTITY_POLY_SEQ).  Now suppose within each
>variant within each region there are four disordered atoms, with
>2 alternate positions each.
> I have tried dealing with this by structuring
>_atom_site.label_alt_id in the form model.variant.altpos, but then
>I end up violating the usual rules on sums of occupancies, and, worse,
>when I try to generate the ensembles to describe each model, each
>variant, and each disordered site, I seem to be getting
>combinatorial explosion.
> Guidance please.

The above example is far too complex to represent the results of an NMR
structure study.  I also believe that, if such a situation arose, it would be
unreasonable to attempt to fit the information described into a single
data_block.  If more than one chemical entity is known to exist, separate
data_blocks should be constructed and populated with the models derived for
each chemical entity.  In the above example, a data_block would be constructed
for each of the four chemical entities (the possible variants).  Each model
listed in a data_block loop_ can then be given a unique
'_atom_site.label_alt_id'.
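
A minimal sketch of that layout follows, assuming the four chemical entities
are read as the four combinations of two variants in each of two regions.  The
block names, the variant labels 'A'/'B' and the 'M1'..'M50' identifier scheme
are invented for illustration and are not taken from the dictionary.

```python
from itertools import product

# Hypothetical labels: two regions of microheterogeneity, two variants each.
REGION_VARIANTS = {"region1": ["A", "B"], "region2": ["A", "B"]}
N_MODELS = 50  # models per chemical entity, as in the example above


def build_data_blocks(region_variants, n_models):
    """Return {block_name: [label_alt_id, ...]}, one block per chemical entity."""
    regions = sorted(region_variants)
    blocks = {}
    for combo in product(*(region_variants[r] for r in regions)):
        # Block naming is an assumed convention, e.g. data_entity_AB.
        name = "data_entity_" + "".join(combo)
        # Each model in the block gets its own alt id (assumed form M1..Mn).
        blocks[name] = [f"M{i}" for i in range(1, n_models + 1)]
    return blocks


blocks = build_data_blocks(REGION_VARIANTS, N_MODELS)
for name, alt_ids in blocks.items():
    print(name, len(alt_ids), alt_ids[0], alt_ids[-1])
```

With this arrangement the alt id only has to distinguish models within one
data_block, so it does not need to encode the variant as well.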

As for the conformational heterogeneity, if you look at a set of NMR models
where each model in the set has been derived from a fit of a chemical structure
to a set of NMR constraints, you will find that every atom is 'disordered'.  In
a few cases, it has been found that the derived models group into multiple
conformational subsets.  The data tags in the ATOM_SITES_ALT category could be
used to define which models belong to specific conformational subsets.
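
As a rough illustration of the grouping, the sketch below collects model
identifiers into conformational subsets.  The subset names and the assignment
of models to subsets are hypothetical; in practice they would come from an
analysis of the derived models.

```python
from collections import defaultdict

# Hypothetical assignment of models (by label_alt_id) to conformational subsets.
MODEL_TO_SUBSET = {
    "M1": "conf_a",
    "M2": "conf_a",
    "M3": "conf_b",
    "M4": "conf_a",
    "M5": "conf_b",
}


def group_by_subset(model_to_subset):
    """Return {subset_name: [label_alt_id, ...]}, keeping the input order."""
    subsets = defaultdict(list)
    for alt_id, subset in model_to_subset.items():
        subsets[subset].append(alt_id)
    return dict(subsets)


subsets = group_by_subset(MODEL_TO_SUBSET)
for name, members in subsets.items():
    print(name, members)
```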

The term 'atom site occupancy' does not apply well to models derived from NMR
data, and I do not think crystallographic or numerical rules for summing
artificial values should be applied to NMR models.  In general, the 'atom
occupancy' for any given model is one.  If 50 models are included in a file,
this does not mean that the atom occupancy is 1/50 for each model.  It simply
means that 50 models met some criteria for inclusion; had a mega-computer been
available, 500 structures might have been included.  If a set of models can be
grouped into two or more subsets representing different conformations and data
are available (observed multiple peaks for a subset of the atoms in the
molecule) that can define the fraction of molecules existing in each
conformation, one could say that a subset of the atoms in the molecule spend
20% of the time in one conformation and 80% in another.  This might be observed
for proline isomerization and other slow conformational effects, such as
equilibria between a folded and unfolded state.  For systems in fast exchange,
there also are methods for estimating fractional populations. It would be
better to describe these data using data tags from other sections of the
dictionary.
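
The distinction can be sketched as follows, keeping the per-model occupancy
separate from the fractional population of each conformational subset.  The
subset names and the 20%/80% split are only the illustrative figures used
above, and the variable names are assumptions rather than dictionary data tags.

```python
import math

N_MODELS = 50  # number of models included in the file, as in the example

# Each model is a complete structure, so every atom in it has occupancy 1.
occupancy = {f"M{i}": 1.0 for i in range(1, N_MODELS + 1)}

# Hypothetical subset populations, e.g. estimated from multiple observed peaks.
subset_population = {"conf_a": 0.2, "conf_b": 0.8}


def populations_are_normalized(populations):
    """True if the fractional populations of the subsets sum to one."""
    return math.isclose(sum(populations.values()), 1.0)


# Summing occupancies over models gives the model count, not 1, so a
# crystallographic sum-of-occupancies check is not the right test here.
total_occupancy = sum(occupancy.values())
print(total_occupancy)
print(populations_are_normalized(subset_population))
```

Only the subset populations are expected to sum to one; the number of models
in the file has no bearing on them.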

Regards,
Eldon Ulrich