Discussion List Archives


Fwd: New DDLm attribute and core dataname to control dataname redefinition

  • To: Distribution list of the IUCr COMCIFS Core Dictionary Maintenance Group <coredmg@iucr.org>
  • Subject: Fwd: New DDLm attribute and core dataname to control dataname redefinition
  • From: James Hester <jamesrhester@gmail.com>
  • Date: Fri, 20 Jan 2017 15:59:04 +1100
Dear Core CIF DMG,

In a recent post to the COMCIFS group I suggested that we needed to formalise and signpost redefinition of datanames. A proposal as to how this might be done using a couple of new _audit datanames and a new DDLm dictionary attribute is now available at  https://github.com/COMCIFS/comcifs.github.io/blob/master/audit.formalism_proposal.md

The DDLm group, acting as technical consultants, have raised no criticisms of the proposal.  As the _audit datanames are currently contained in the core dictionary, your comments on this proposal are sought within the next few weeks, after which the proposal will be returned to COMCIFS for final approval.

I have also appended the proposal to this post if you would like to insert comments in a reply to this email. Note that the above link is much more nicely formatted.

all the best,

# Proposal for new datanames and attributes to cover differing models

## Introduction

The following proposal implements part of a solution to incorporating
multiple models into CIF, discussed
[here](changing_meanings_discussion_paper.md).  It should be read in
conjunction with that document.

## New datanames: `_audit.formalism` and `_audit.formalism_version`

Each value of `_audit.formalism` corresponds to a particular way of
deriving some set of CIF datanames from other datanames defined in the
same, or imported, dictionaries.  For better interoperability, we
stipulate that datanames may only be redefined by dictionaries if, for
some values of the datanames from which the redefined datanames are
calculated, the derived dataname takes the values that the original
dataname would have taken. So, for example, if `_atom_site.moment` is
zero, `_refln.F_complex` has the same values as in core CIF, so it is
acceptable for a magnetic dictionary to redefine `_refln.F_complex`.

`_audit.formalism_version` is provided to allow secondary parameters
to be added to the model without changing the overall formalism. As a
guide, parameters are considered secondary if they do not require the
addition of new columns to any category, and do not significantly
change the final calculated values in "typical" cases.
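
As a rough illustration of how a reading program might treat `_audit.formalism_version`, the sketch below compares the version found in a datablock with the newest version the program was written against. The dotted-integer version format is an assumption based only on the default value `1.0` given in the appendix, and the function names are hypothetical.

```python
def parse_version(text):
    """Split a dotted version string such as '1.0' into a tuple of ints.

    Assumption: versions are dot-separated non-negative integers.
    """
    return tuple(int(part) for part in text.strip().split("."))


def newer_than_known(block_version, known_version):
    """Return True if the datablock's formalism version is newer than the
    newest version this (hypothetical) reader knows about."""
    block = parse_version(block_version)
    known = parse_version(known_version)
    # Pad with zeros so that '1' and '1.0' compare as equal.
    width = max(len(block), len(known))
    block += (0,) * (width - len(block))
    known += (0,) * (width - len(known))
    return block > known
```

Because a version change only adds secondary parameters, a reader could reasonably continue when `newer_than_known` returns true and simply warn the user that some secondary parameters may be ignored; that policy is a suggestion, not part of the proposal.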

All CIF datablocks should include these new datanames when they take
non-default values; the default values correspond to the
single-crystal model described in core CIF.  Most CIF reading programs
should check these datanames in order to avoid miscalculating derived

The choice of the word `formalism` is purely to avoid clashing with
the widespread use of `model` in core CIF to refer to the particular
arrangement of atoms. There may be a better word.  See the appendix
for formal DDLm definitions.
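
The check that a reading program would perform can be sketched as follows. This is a minimal illustration only: it assumes the datablock has already been parsed into a plain dictionary of dataname/value pairs by some CIF parser (no particular library is implied), and the function and exception names are hypothetical. The defaults are those given in the appendix definitions.

```python
# Defaults taken from the proposed definitions in Appendix I.
DEFAULT_FORMALISM = "Base"
DEFAULT_VERSION = "1.0"


class UnsupportedFormalism(Exception):
    """Raised when a datablock uses a formalism the reader does not implement."""


def check_formalism(block, supported=("Base",)):
    """Return (formalism, version) for a datablock, or raise if unsupported.

    `block` is assumed to be a dict mapping datanames to string values.
    Absent datanames take their default values.
    """
    formalism = block.get("_audit.formalism", DEFAULT_FORMALISM)
    version = block.get("_audit.formalism_version", DEFAULT_VERSION)
    if formalism not in supported:
        raise UnsupportedFormalism(
            f"datablock uses formalism {formalism!r}; supported: {supported}"
        )
    return formalism, version
```

A program that only implements the core CIF single-crystal model would call `check_formalism(block)` before computing any derived values and refuse, or fall back to reporting raw values, when the exception is raised.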

## New DDLm attributes: `_dictionary.formalism` and `_dictionary.formalism_version`

These attributes associate a dictionary with a particular formalism.

## Treatment of current dictionaries

### Modulated structures

The modulated structures dictionary is assigned formalism `modulated`
and redefines `_refln.F_complex`, `_refln.sin_theta_over_lambda` and

### Magnetism

The magnetism dictionary builds on the modulated structures dictionary.
It is assigned formalism `magnetic` and redefines `_refln.F_complex`

### Powder

The powder dictionary calculates structure factors from information
that may be held in a different datablock.  It therefore redefines
`_refln.F_complex`.  `_refln.F_meas` is also redefined as the
determination of this from the powder observations is clearly
different to the way in which it is derived from single-crystal spots,
not least because of pervasive overlap.

Separate formalisms are necessary for each possible combination of
powder with other formalisms, for example `_audit.formalism` can
take values `powder-magnetic`, `powder-modulated` and

### Electron density

The electron density dictionary allows parameterisation of the
electron density around each atom in terms of multipoles.  With
appropriate choice of coefficients this reduces to the spherical atom
model used in core CIF, so is an acceptable redefinition.
`_refln.F_complex` is redefined, and a formalism of `multipole` is assigned.

### Constraints and restraints

This dictionary relates only to the method of determination of the
final parameters and therefore does not affect the definitions of
the final datanames.

### Twinning

Twinning does not change the structural model, but it does change the
way of determining `_refln.F_meas` from the observations. A formalism
of `twinning` is assigned and, as for powder, separate formalisms need
to be assigned for each distinct structural model.

### Image CIF

ImgCIF relates only to raw data and is not affected by these changes.

### mmCIF

mmCIF is based on the core CIF model and is therefore unaffected by
these changes.

## Treatment of other techniques

### Laue

A Laue experiment measures distinct spots, but each spot is produced
by a distinct wavelength, and spots sometimes overlap.
`_refln.wavelength` therefore becomes an additional key column in
`refln`. This change by itself is easily covered by defining a
different `_audit.schema`.  However, a Laue dictionary must also
redefine `_refln.F_meas` as the extraction of notional observed
intensities will depend on the model for wavelength distribution, and
so we must assign a separate `_audit.formalism`. As for powder and
twinning, there will be a separate `formalism` for each distinct
structural model.
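
The redefinitions described in the sections above can be collected into a small lookup table, which lets a program ask whether any dataname it relies on is redefined under a given formalism. The sketch below is illustrative only: it lists just the datanames explicitly named in this text (the lists for some dictionaries may be incomplete), the combined formalisms are omitted, and the table and function names are hypothetical.

```python
# Datanames named in the text as redefined by each formalism.
# Assumption: these lists are not exhaustive.
REDEFINED = {
    "base": frozenset(),
    "modulated": frozenset({"_refln.F_complex", "_refln.sin_theta_over_lambda"}),
    "magnetic": frozenset({"_refln.F_complex"}),
    "powder": frozenset({"_refln.F_complex", "_refln.F_meas"}),
    "multipole": frozenset({"_refln.F_complex"}),
    "twinning": frozenset({"_refln.F_meas"}),
    "laue": frozenset({"_refln.F_meas"}),
}


def affected_datanames(formalism, relied_upon):
    """Return the sorted datanames in `relied_upon` that `formalism` redefines."""
    try:
        redefined = REDEFINED[formalism.lower()]
    except KeyError:
        raise ValueError(f"unknown formalism: {formalism!r}") from None
    return sorted(redefined & set(relied_upon))
```

For example, a program that reads only `_refln.F_meas` and cell parameters gets an empty list for `multipole`, so under the assumptions of this table it could process such a datablock unchanged.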

## Discussion

### Mixing and matching not possible

It is tempting to define something like `_audit.technique` to cover
the technique-based differences, so that `_audit.technique` and
`_audit.formalism` could correspond to different dictionaries that
could be mixed and matched. So, instead of a `powder-magnetic`
formalism, there would simply be a `powder` technique combined with a
`magnetic` formalism, with both dictionaries being separately imported
and notionally orthogonal to one another.

However, any `formalism` that adds keys to the `refln` category will
also require the `technique` to be aware of those keys in order to
explain how `_refln.F_meas` is determined.  For example, a powder
experiment on a modulated structure will calculate the `_refln.F_meas`
value differently to a powder experiment on a non-modulated structure
as the calculations of peak position require different numbers of
indices.  Therefore, it is not possible to generally separate the
technique from the structural model, although it may be possible in
particular cases.

### Just use `_audit_conform`?

Core CIF has long provided the `_audit_conform_dict_*` tags to state which
dictionary or dictionaries a datablock conforms to.  This appears almost
as simple as the proposed `_audit.formalism` tag, so the need for a
separate tag may not be apparent.

However, while the `_audit_conform` mechanism must remain the
canonical source of information, the proposed dataname provides a
simplified route to the same information. In order for a CIF reading
program to confirm that none of the dictionaries listed in a CIF block
change any of the definitions relied upon by that program, it must in
general download the stated versions of the dictionary or dictionaries
from the canonical IUCr site, parse and merge them, and then find any
definitions that have (apparently) been replaced.  Compared to this
procedure, the `_audit.formalism` tag is a much simpler way for the
datablock writer to specify to the datablock reader a particular set
of dataname interpretations that may never change.

Notably, the `_audit_conform_*` mechanism is almost never used. As of
May 28, 2016, there were 195 modulated structures in the
Crystallography Open Database (as determined by the presence of
`Fourier_wave_vector` in a file). Of these, none had an
`_audit_conform` entry.  We conclude that introduction of
`_audit.formalism` should be accompanied by an education and outreach
program as well.
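
A survey of this kind can be approximated with a simple text scan. The sketch below is not the script used to obtain the figures above; it only illustrates the stated criterion (presence of `Fourier_wave_vector`) together with a crude substring test for `_audit_conform`. It takes file contents as strings so that it does not depend on any particular archive layout, and matching case-insensitively is an assumption.

```python
def classify_cif_text(text):
    """Return (is_modulated, has_audit_conform) for the text of one CIF file.

    Both tests are plain substring matches, so they will also match
    occurrences inside comments or text fields.
    """
    lowered = text.lower()
    is_modulated = "fourier_wave_vector" in lowered
    has_conform = "_audit_conform" in lowered
    return is_modulated, has_conform


def survey(texts):
    """Count modulated files, and how many of those declare _audit_conform."""
    modulated = 0
    with_conform = 0
    for text in texts:
        is_modulated, has_conform = classify_cif_text(text)
        if is_modulated:
            modulated += 1
            if has_conform:
                with_conform += 1
    return modulated, with_conform
```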

### Interaction with `_audit.schema`

`_audit.schema` essentially allows fixed parameters to vary. It
is therefore orthogonal to `_audit.formalism`: a given formalism
may have many possible schemas, and many schemas may apply to
multiple formalisms if they share the same parameters.

In other words, a suitably-written program can handle a variety of
schemas for a single formalism without needing to change the way in
which any dataname is calculated, whereas a program must change the way
in which the redefined datanames are calculated if the formalism
changes.

# Appendix I: New core definitions

## _audit.formalism


_definition.id       '_audit.formalism'
_name.category_id    audit
_name.object_id      formalism

     The CIF dictionaries listed in _audit.dictionary may redefine
     datanames. _audit.formalism is provided as an efficient
     alternative to parsing and checking those dictionaries. It
     identifies commonly-used sets of meanings for datanames. In
     general, each value taken by _audit.formalism is linked to a
     particular technique and/or structural approach.  The
     dictionaries for the datablock (see _audit.dictionary) must be
     compatible with the value of _audit.formalism.

_type.contents          Text
_type.purpose           State
_type.container         Single
_type.source            Assigned
    Base                'Single crystal model from core CIF'
    Modulated           'Single crystal modulated structure'
    Magnetic            'Single crystal magnetic structure, potentially modulated'
    Powder              'Powder diffraction experiment'
    Twinned             'Twinned crystal using core CIF model'
    Multipole           'Single crystal model with multipole coefficients'
    Laue                'Laue experiment on single crystal'
    Powder-Modulated    'Powder experiment on a modulated structure'
    Powder-Magnetic     'Powder experiment on a modulated magnetic structure'
    Powder-Multipole    'Powder experiment modelled with multipoles'
    Laue-Magnetic       'Laue experiment on magnetic structure'
    Laue-Modulated      'Laue experiment on modulated non-magnetic structure'
    Laue-Multipole      'Laue experiment modelled with multipoles'
    Twinned-Magnetic    'Twinned magnetic single crystal structure'
    Twinned-Modulated   'Twinned modulated single crystal structure'
    Laue-Twinned        'Laue experiment on twinned single crystal'
    Laue-Twinned-Modulated 'Laue experiment on twinned modulated structure'
    Custom              'Examine dictionaries provided in _audit_conform'
    Local               'Locally modified dictionaries. These datafiles should not be distributed'
_enumeration.default    Base

## _audit.formalism_version


_definition.id       '_audit.formalism_version'
_name.category_id    audit
_name.object_id      formalism_version

     The version of the given formalism (see `_audit.formalism`). The version
     number of a formalism is incremented when new model parameters are
     added that do not significantly affect the model values in typical cases.

_type.contents          Text
_type.purpose           State
_type.container         Single
_type.source            Assigned
_enumeration.default    1.0

## _dictionary.formalism


    _definition.id               '_dictionary.formalism'
    _definition.class            Attribute
    _definition.update           2016-12-17

     The value of this attribute is associated with the set of
     dataname meanings contained in this dictionary.

    _name.category_id            dictionary
    _name.object_id              formalism
    _type.purpose                Audit
    _type.source                 Assigned
    _type.container              Single
    _type.contents               Text


# Appendix II: a full hybrid dictionary

A complete dictionary using the above mechanisms is presented below.

#                                                  #
#    Dictionary for modulated powder diffraction   #
#                                                  #


_dictionary.title             MODPOW
_dictionary.formalism         Powder-Modulated
_dictionary.class             Instance
_dictionary.version           1.0
_dictionary.date              2016-12-19
_dictionary.ddl_conformance   3.12
_dictionary.namespace         MODPOW

    The modulated powder diffraction dictionary redefines datanames
    for use when presenting the results of a powder diffraction
    experiment using a modulated structure model.  The remainder of the
    relevant definitions are found in the modulated structures
    dictionary and the powder diffraction dictionary.


    _definition.id       MODPOW_GROUP
    _definition.scope    Category
    _definition.class    Head
    _definition.update   2016-12-19
    This category is the parent category for all definitions
    in the MODPOW dictionary

    _name.category_id     MODPOW
    _name.object_id       MODPOW_GROUP
    # The following import reads in and reparents all powder and
    # modulated structure definitions to the MODPOW_GROUP category. As cif_ms is
    # read second, the refln category will have the extra modulation indices
    # defined.
    _import.get           [{"file":"cif_pow.dic" "save":"PD_GROUP" "mode":"Full"}
                           {"file":"cif_ms.dic"  "save":"MS_GROUP" "mode":"Full"}]



_definition.id                          '_refln.F_complex'
_definition.update                      2016-12-19
     The structure factor vector for the reflection calculated from
     the modulated structure given in the datablock identified by
_name.category_id                       refln
_name.object_id                         F_complex
_type.purpose                           Measurand
_type.source                            Derived
_type.container                         Single
_type.contents                          Complex
_enumeration.default                    0.

#  A complete dREL expression for F_complex can be provided here, using all of the
#  parameters provided in the powder and modulated structure dictionaries.


_definition.id                          '_refln.F_meas'
_definition.update                      2016-12-19
     The structure factor amplitude for the modulated reflection based on
     partitioning of each observed powder diffraction intensity between
     contributing reflections in proportion to the model reflection contributions.
_name.category_id                       refln
_name.object_id                         F_meas
_type.purpose                           Measurand
_type.source                            Derived
_type.container                         Single
_type.contents                          Real
_enumeration.default                    0.
# A complete dREL expression for calculating F_meas from an observed powder diffractogram can be given here.
