Herbert Bernstein writes:

>Paula's message says that NMR ensembles of structure are to be handled
>by the existing ATOM_SITES_ALT category. I am trying to design code
>for this purpose and would appreciate guidance on what is intended.
>The specific test case that keeps tripping me up is as follows:
>  Suppose you have 50 models. Within each model, suppose there
>are two regions of microheterogeneity. Within each region suppose
>one variant has 4 residues and the other has 5 (just to make sure
>we know how to handle ENTITY_POLY_SEQ). Now suppose within each
>variant within each region there are four disordered atoms, with
>2 alternate positions each.
>  I have tried dealing with this by structuring
>_atom_site.label_alt_id in the form model.variant.altpos, but then
>I end up violating the usual rules on sums of occupancies, and, worse,
>when I try to generate the ensembles to describe each model, each
>variant, and each disordered site, I seem to be getting
>combinatorial explosion.
>  Guidance please.

The above example is far too complex to represent the results of an
NMR structure study. I also believe that, if such a situation arose,
it would be unreasonable to attempt to fit the information described
into a single data_block.

If more than one chemical entity is known to exist, separate
data_blocks should be constructed and populated with the models
derived for each chemical entity. A data_block would be constructed
for each of the four chemical entities (possible variants in the
above example). Each model listed in a data_block loop_ can be given
a unique '_atom_site.label_alt_id'.

As for the conformational heterogeneity, if you look at a set of NMR
models where each model in the set has been derived from a fit of a
chemical structure to a set of NMR constraints, you will find that
every atom is 'disordered'. In a few cases, it has been found that
the derived models group into multiple conformational subsets.
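
As a rough illustration of the two bookkeeping schemes discussed above,
here is a minimal Python sketch. It is not part of any dictionary or
existing software. It assumes that every disordered atom chooses its
position independently, that each of the four chemical entities gets 50
models, and it uses made-up data_block names and alt ids ("M1" .. "M50");
all of these are assumptions for illustration only.

```python
from itertools import product

# Numbers taken from the test case quoted above.
N_MODELS = 50
REGIONS = 2
VARIANTS_PER_REGION = 2
DISORDERED_ATOMS = 4   # per variant, per region
ALT_POSITIONS = 2      # per disordered atom


def count_combined_ensembles():
    """Count ensembles if model, variant and atom position are all
    enumerated together (assumes atoms vary independently)."""
    per_variant = ALT_POSITIONS ** DISORDERED_ATOMS
    per_region = VARIANTS_PER_REGION * per_variant
    per_model = per_region ** REGIONS
    return N_MODELS * per_model


def build_blocks():
    """One data_block per chemical entity, one alt id per model.
    Block names and alt ids are hypothetical."""
    blocks = {}
    for combo in product(range(1, VARIANTS_PER_REGION + 1), repeat=REGIONS):
        name = "data_" + "_".join(
            f"r{region}v{variant}"
            for region, variant in enumerate(combo, start=1)
        )
        # Assumption: each entity carries its own 50 models.
        blocks[name] = [f"M{m}" for m in range(1, N_MODELS + 1)]
    return blocks


if __name__ == "__main__":
    print("combined scheme ensembles:", count_combined_ensembles())
    for name, alt_ids in build_blocks().items():
        print(name, len(alt_ids), "models, first alt id", alt_ids[0])
```

Under these assumptions the combined scheme grows multiplicatively,
while the per-entity layout needs only four data_blocks with one
label_alt_id per model in each.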

The data tags in the ATOM_SITES_ALT category could be used to define
which models belong to specific conformational subsets.

The term 'atom site occupancy' does not apply well to models derived
from NMR data, and I do not think crystallographic or numerical rules
for summing artificial values should be applied to NMR models. In
general, the 'atom occupancy' for any given model is one. If 50 models
are included in a file, this does not mean that the atom occupancy is
1/50 for each model. It simply means that 50 models fit some criteria
for inclusion and that, if a mega-computer had been available, 500
structures might have been included.

If a set of models can be grouped into two or more subsets
representing different conformations, and data are available (observed
multiple peaks for a subset of the atoms in the molecule) that can
define the fraction of molecules existing in each conformation, one
could say that a subset of the atoms in the molecule spends 20% of the
time in one conformation and 80% in another. This might be observed
for proline isomerization and other slow conformational effects, such
as equilibria between a folded and an unfolded state. For systems in
fast exchange, there are also methods for estimating fractional
populations. It would be better to describe these data using data tags
from other sections of the dictionary.

Regards,
Eldon Ulrich