Re: [ddlm-group] New DDLm attribute and core dataname to controldataname redefinition

  • To: ddlm-group <ddlm-group@iucr.org>
  • Subject: Re: [ddlm-group] New DDLm attribute and core dataname to controldataname redefinition
  • From: James Hester <jamesrhester@gmail.com>
  • Date: Fri, 13 Jan 2017 16:16:44 +1100
  • In-Reply-To: <CAM+dB2dyvV0H7ehuxi06L6gr3+eqmRjt11PXh7Lw9kmMPcc+6w@mail.gmail.com>
  • References: <CAM+dB2dyvV0H7ehuxi06L6gr3+eqmRjt11PXh7Lw9kmMPcc+6w@mail.gmail.com>
Dear DDLm-group,

Happy New Year!

There having been no comments on my previous email proposing the _audit.formalism tag, I will submit the proposal unaltered to the core_cif group for approval.  Assuming no objections from the core CIF DMG, I will begin the process of updating the relevant dictionaries where necessary. 

all the best,
James.


On 19 December 2016 at 15:33, James Hester <jamesrhester@gmail.com> wrote:
Dear DDLm group,

In a recent post to the COMCIFS group I suggested that we needed to formalise and signpost redefinition of datanames. A proposal as to how this might be done using a couple of new _audit datanames and a new DDLm dictionary attribute is now available at  https://github.com/COMCIFS/comcifs.github.io/blob/master/audit.formalism_proposal.md

I have also appended the proposal to this post if you would like to insert comments, although the above link is much more nicely formatted.

Implementation of the DDLm version of the modulated structures dictionary is, to some extent, dependent on this moving forward.  It would therefore be good if you could comment before the end of the week, if only to say that you would like to contribute to the discussion but need more time.  Then I'll know to at least chase you in the New Year if I haven't heard.

all the best,
James.
--

# Proposal for new datanames and attributes to cover differing models

## Introduction

The following proposal implements part of a solution to incorporating
multiple models into CIF, discussed
[here](changing_meanings_discussion_paper.md).  It should be read in
conjunction with that document.

## New datanames: `_audit.formalism` and `_audit.formalism_version`

Each value of `_audit.formalism` corresponds to a particular way of
deriving some set of CIF datanames from other datanames defined in the
same, or imported, dictionaries.  For better interoperability, we
stipulate that datanames may only be redefined by dictionaries if, for
some values of the datanames from which the redefined datanames are
calculated, the derived dataname takes the values that the original
dataname would have taken. So, for example, if `_atom_site.moment` is
zero, `_refln.F_complex` has the same values as in core CIF, so it is
acceptable for a magnetic dictionary to redefine `_refln.F_complex`.
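
The compatibility rule above can be illustrated with a toy numerical check. The functions below are purely hypothetical stand-ins, not real structure-factor calculations: `core_value` plays the role of the original definition, and `redefined_value` the role of a redefinition that depends on one extra parameter (analogous to `_atom_site.moment`).

```python
# Toy illustration only: these functions are hypothetical stand-ins,
# not real structure-factor formulas.

def core_value(scattering: complex) -> complex:
    """Stand-in for a dataname as originally defined (e.g. in core CIF)."""
    return scattering


def redefined_value(scattering: complex, moment: float) -> complex:
    """Stand-in for a redefinition that depends on one extra parameter."""
    return scattering + moment * 1j  # extra term vanishes when moment == 0


def reduces_to_original(samples, neutral_moment: float = 0.0) -> bool:
    """True if the redefinition reproduces the original values when the
    extra parameter takes its neutral value."""
    return all(
        redefined_value(s, neutral_moment) == core_value(s) for s in samples
    )


samples = [1 + 2j, -0.5 + 0j, 3.25 - 1j]
print(reduces_to_original(samples))
```

A redefinition for which no such neutral parameter value exists would, under the rule as stated, not be an acceptable redefinition.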

`_audit.formalism_version` is provided to allow secondary parameters
to be added to the model without changing the overall formalism. As a
guide, parameters are considered secondary if they do not require the
addition of new columns to any category, and do not significantly
change the final calculated values in "typical" cases.

All CIF datablocks should include these new datanames when they take
non-default values; the default values correspond to the
single-crystal model described in core CIF.  Most CIF reading programs
should check these datanames in order to avoid miscalculating derived
values.  
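
A minimal sketch of the check a reading program might perform is given below. It assumes that a datablock has already been parsed into a plain `dict` of datanames to string values; the helper names and the case-insensitive comparison are assumptions made for illustration, not part of the proposal.

```python
# Sketch only: `block` is assumed to be a plain dict mapping datanames to
# string values, as some CIF parser might produce.

DEFAULT_FORMALISM = "Base"   # default proposed for _audit.formalism
DEFAULT_VERSION = "1.0"      # default proposed for _audit.formalism_version


def formalism_of(block: dict) -> tuple:
    """Return (formalism, version), applying the proposed defaults."""
    formalism = block.get("_audit.formalism", DEFAULT_FORMALISM)
    version = block.get("_audit.formalism_version", DEFAULT_VERSION)
    return formalism, version


def can_interpret(block: dict, supported: set) -> bool:
    """True if this program understands the block's formalism.

    Values are compared case-insensitively (an assumption).
    """
    formalism, _ = formalism_of(block)
    return formalism.lower() in {name.lower() for name in supported}


supported = {"Base", "Modulated"}
print(can_interpret({}, supported))
print(can_interpret({"_audit.formalism": "Powder-Magnetic"}, supported))
```

A program that finds an unsupported formalism could then decline to use the redefined datanames rather than silently treating them as core CIF values.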

The choice of the word `formalism` is purely to avoid clashing with
the widespread use of `model` in core CIF to refer to the particular
arrangement of atoms. There may be a better word.  See the appendix
for formal DDLm definitions.

## New DDLm attributes: `_dictionary.formalism` and `_dictionary.formalism_version`

These attributes associate a dictionary with a particular formalism.

## Treatment of current dictionaries

### Modulated structures

The modulated structures dictionary is assigned formalism `modulated`
and redefines `_refln.F_complex`, `_refln.sin_theta_over_lambda` and
`_refln.symmetry_multiplicity`.

### Magnetism

The magnetism dictionary builds on the modulated structures dictionary.
It is assigned formalism `magnetic` and redefines `_refln.F_complex`
only.

### Powder

The powder dictionary calculates structure factors from information
that may be held in a different datablock.  It therefore redefines
`_refln.F_complex`.  `_refln.F_meas` is also redefined as the
determination of this from the powder observations is clearly
different to the way in which it is derived from single-crystal spots,
not least because of pervasive overlap.

Separate formalisms are necessary for each possible combination of
powder with other formalisms; for example, `_audit.formalism` can
take the values `powder-magnetic`, `powder-modulated` and
`powder-multipole`.

### Electron density

The electron density dictionary allows parameterisation of the
electron density around each atom in terms of multipoles.  With
appropriate choice of coefficients this reduces to the spherical atom
model used in core CIF, so is an acceptable redefinition.
`_refln.F_complex` is redefined, and a formalism of `multipole` is
assigned.

### Constraints and restraints

This dictionary relates only to the method of determination of the
final parameters and therefore does not affect the definitions of
the final datanames.

### Twinning

Twinning does not change the structural model, but it does change the
way of determining `_refln.F_meas` from the observations. A formalism
of `twinned` is assigned and, as for powder, separate formalisms need
to be assigned for each distinct structural model.

### Image CIF

ImgCIF relates only to raw data and is not affected by these changes.

### mmCIF

mmCIF is based on the core CIF model and is therefore unaffected by
these changes.

## Treatment of other techniques

### Laue

A Laue experiment measures distinct spots, but each spot is produced
by a distinct wavelength, and spots sometimes overlap.
`_refln.wavelength` therefore becomes an additional key column in
`refln`. This change by itself is easily covered by defining a
different `_audit.schema`.  However, a Laue dictionary must also
redefine `_refln.F_meas` as the extraction of notional observed
intensities will depend on the model for wavelength distribution, and
so we must assign a separate `_audit.formalism`. As for powder and
twinning, there will be a separate `formalism` for each distinct
structural model.

## Discussion

### Mixing and matching not possible

It is tempting to define something like `_audit.technique` to cover
the technique-based differences, so that `_audit.technique` and
`_audit.formalism` could correspond to different dictionaries that
could be mixed and matched. So, instead of a `powder-magnetic`
formalism, there would simply be a `powder` technique combined with a
`magnetic` formalism, with both dictionaries being separately imported
and notionally orthogonal to one another.

However, any `formalism` that adds keys to the `refln` category will
also require the `technique` to be aware of those keys in order to
explain how `_refln.F_meas` is determined.  For example, a powder
experiment on a modulated structure will calculate the `_refln.F_meas`
value differently to a powder experiment on a non-modulated structure
as the calculations of peak position require different numbers of
indices.  Therefore, it is not possible to generally separate the
technique from the structural model, although it may be possible in
particular cases.

### Just use `_audit_conform`?

Core CIF has long provided the `_audit_conform_dict_*` tags to state which
dictionary or dictionaries a datablock conforms to.  This appears almost
as simple as the proposed `_audit.formalism` tag, so the need for a
separate tag may not be apparent.

However, while the `_audit_conform` mechanism must remain the
canonical source of information, the proposed dataname provides a
simplified route to the same information. In order for a CIF reading
program to confirm that none of the dictionaries listed in a CIF block
change any of the definitions relied upon by that program, it must in
general download the stated versions of the dictionary or dictionaries
from the canonical IUCr site, parse and merge them, and then find any
definitions that have (apparently) been replaced.  Compared to this
procedure, the `_audit.formalism` tag is a much simpler way for the
datablock writer to tell the datablock reader which fixed set of
dataname interpretations applies.
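
The longer procedure can be sketched as follows, to show what the tag lets a reader avoid. Each dictionary is modelled here as a plain `dict` of dataname to definition text; downloading, parsing and import resolution are omitted, and all function names are hypothetical.

```python
# Sketch only: each dictionary is modelled as a plain dict mapping a
# dataname to its definition text. Real DDLm dictionaries need a parser
# and import resolution, which are omitted here.

def redefined_datanames(dictionaries: list) -> set:
    """Return (lower-cased) datanames whose definition is replaced by a
    dictionary appearing later in the list."""
    seen = {}
    replaced = set()
    for dictionary in dictionaries:
        for name, definition in dictionary.items():
            key = name.lower()
            if key in seen and seen[key] != definition:
                replaced.add(key)
            seen[key] = definition
    return replaced


def affects_program(dictionaries: list, relied_upon: set) -> set:
    """Datanames the program relies on whose definitions were replaced."""
    wanted = {name.lower() for name in relied_upon}
    return redefined_datanames(dictionaries) & wanted


core = {"_refln.F_complex": "core definition", "_cell.length_a": "cell edge a"}
modulated = {"_refln.F_complex": "modulated definition"}
print(affects_program([core, modulated], {"_refln.F_complex", "_cell.length_a"}))
```

With `_audit.formalism`, the same decision reduces to comparing a single value against the formalisms the program supports.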

Notably, the `_audit_conform_*` mechanism is almost never used. As of
May 28, 2016, there were 195 modulated structures in the
Crystallography Open Database (as determined by the presence of
`Fourier_wave_vector` in a file). Of these, none had an
`_audit_conform` entry.  We conclude that the introduction of
`_audit.formalism` should be accompanied by an education and outreach
program.
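
A survey of this kind could be approximated with simple substring tests, as in the sketch below. This is not the script used for the figures above: only the `Fourier_wave_vector` criterion comes from the text, while the case-insensitive matching and the function names are assumptions.

```python
# Sketch only: operates on the text of CIF files already read into memory.

def classify_cif_text(text: str) -> dict:
    """Report whether a CIF file's text looks modulated and whether it
    mentions any _audit_conform dataname, using substring tests."""
    lowered = text.lower()
    return {
        "modulated": "fourier_wave_vector" in lowered,
        "has_audit_conform": "_audit_conform" in lowered,
    }


def count_undeclared(texts) -> tuple:
    """Return (number of modulated files, number of those lacking any
    _audit_conform entry)."""
    flags = [classify_cif_text(t) for t in texts]
    modulated = [f for f in flags if f["modulated"]]
    undeclared = [f for f in modulated if not f["has_audit_conform"]]
    return len(modulated), len(undeclared)
```

A substring test can be fooled by comments or text fields that happen to mention these names, so a careful count would parse each file instead.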

### Interaction with `_audit.schema`

`_audit.schema` essentially allows fixed parameters to vary. It
is therefore orthogonal to `_audit.formalism`: a given formalism
may have many possible schemas, and many schemas may apply to
multiple formalisms if they share the same parameters.

In other words, a suitably-written program can handle a variety of
schemas for a single formalism without needing to change the way in
which any dataname is calculated, whereas a program must change the way
in which the redefined datanames are calculated if the formalism
changes.
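
One way to picture this separation is sketched below: the formalism selects the calculation, while the schema only decides which columns identify a row. The block layout, the calculator functions and the key-column lists are all hypothetical.

```python
# Sketch only: block layout, calculator functions and key-column lists are
# hypothetical; they are not part of the proposal.

def f_complex_base(row: dict) -> str:
    return "core CIF calculation"


def f_complex_modulated(row: dict) -> str:
    return "modulated-structure calculation"


# The formalism alone selects how a redefined dataname is calculated.
CALCULATORS = {"base": f_complex_base, "modulated": f_complex_modulated}


def calculator_for(block: dict):
    """Pick the calculation from _audit.formalism (default 'Base')."""
    formalism = block.get("_audit.formalism", "Base").lower()
    if formalism not in CALCULATORS:
        raise ValueError(f"unsupported formalism: {formalism}")
    return CALCULATORS[formalism]


def index_rows(rows: list, key_columns: list) -> dict:
    """Index loop rows by the key columns that the schema prescribes."""
    return {tuple(row[c] for c in key_columns): row for row in rows}


rows = [
    {"h": 1, "k": 0, "l": 0, "wavelength": "1.0"},
    {"h": 1, "k": 0, "l": 0, "wavelength": "1.5"},
]
# Same calculator, two different schemas for the same rows.
print(len(index_rows(rows, ["h", "k", "l"])))
print(len(index_rows(rows, ["h", "k", "l", "wavelength"])))
```

Changing the key columns alters how rows are looked up but leaves `calculator_for` untouched, whereas supporting a new formalism means adding a new entry to `CALCULATORS`.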

# Appendix I: New core definitions

## _audit.formalism

```
save_audit.formalism

_definition.id       '_audit.formalism'
_name.category_id    audit
_name.object_id      formalism
_description.text
;

     The CIF dictionaries listed in _audit.dictionary may redefine
     datanames. _audit.formalism is provided as an efficient
     alternative to parsing and checking those dictionaries. It
     identifies commonly-used sets of meanings for datanames. In
     general, each value taken by _audit.formalism is linked to a
     particular technique and/or structural approach.  The
     dictionaries for the datablock (see _audit.dictionary) must be
     compatible with the value of _audit.formalism.

;
_type.contents          Text
_type.purpose           State
_type.container         Single
_type.source            Assigned
loop_
_enumeration_set.state
_enumeration_set.detail
    Base                'Single crystal model from core CIF'
    Modulated           'Single crystal modulated structure'
    Magnetic            'Single crystal magnetic structure, potentially modulated'
    Powder              'Powder diffraction experiment'
    Twinned             'Twinned crystal using core CIF model'
    Multipole           'Single crystal model with multipole coefficients'
    Laue                'Laue experiment on single crystal'
    Powder-Modulated    'Powder experiment on a modulated structure'
    Powder-Magnetic     'Powder experiment on a modulated magnetic structure'
    Powder-Multipole    'Powder experiment modelled with multipoles'
    Laue-Magnetic       'Laue experiment on magnetic structure'
    Laue-Modulated      'Laue experiment on modulated non-magnetic structure'
    Laue-Multipole      'Laue experiment modelled with multipoles'
    Twinned-Magnetic    'Twinned magnetic single crystal structure'
    Twinned-Modulated   'Twinned modulated single crystal structure'
    Laue-Twinned        'Laue experiment on twinned single crystal'
    Laue-Twinned-Modulated 'Laue experiment on twinned modulated structure'
    Custom              'Examine dictionaries provided in _audit_conform'
    Local               'Locally modified dictionaries. These datafiles should not be distributed'
_enumeration.default    Base
save_
```

## _audit.formalism_version

```
save_audit.formalism_version

_definition.id       '_audit.formalism_version'
_name.category_id    audit
_name.object_id      formalism_version
_description.text
;

     The version of the given formalism (see `_audit.formalism`). The version
     number of a formalism is incremented when new model parameters are
     added that do not significantly affect the model values in typical cases.

;
_type.contents          Text
_type.purpose           State
_type.container         Single
_type.source            Assigned
_enumeration.default    1.0
save_
```

## _dictionary.formalism

```
save_dictionary.formalism

    _definition.id               '_dictionary.formalism'
    _definition.class            Attribute
    _definition.update           2016-12-17
    _description.text                   
;

     The value of this attribute is associated with the set of
     dataname meanings contained in this dictionary.

;
    _name.category_id            dictionary
    _name.object_id              formalism
    _type.purpose                Audit
    _type.source                 Assigned
    _type.container              Single
    _type.contents               Text

save_
```

# Appendix II: a full hybrid dictionary

A complete dictionary using the above mechanisms is presented below.

```
#\#CIF_2.0
####################################################
#                                                  #
#    Dictionary for modulated powder diffraction   #
#                                                  #
####################################################

data_MODPOW

_dictionary.title             MODPOW
_dictionary.formalism         Powder-Modulated
_dictionary.class             Instance
_dictionary.version           1.0
_dictionary.date              2016-12-19
_dictionary.ddl_conformance   3.12
_dictionary.namespace         MODPOW
_description.text
;

    The modulated powder diffraction dictionary redefines datanames
    for use when presenting the results of a powder diffraction
    experiment using a modulated structure model.  The remainder of the
    relevant definitions are found in the modulated structures
    dictionary and the powder diffraction dictionary.
    
;

save_MODPOW_GROUP

    _definition.id       MODPOW_GROUP
    _definition.scope    Category
    _definition.class    Head
    _definition.update   2016-12-19
    _description.text
;
    This category is the parent category for all definitions
    in the MODPOW dictionary
;

    _name.category_id     MODPOW
    _name.object_id       MODPOW_GROUP
    
    # The following import reads in and reparents all powder and
    # modulated structure definitions to the MODPOW_GROUP category. As cif_ms is
    # read second, the refln category will have the extra modulation indices
    # defined.
    
    _import.get           [{"file":"cif_pow.dic" "save":"PD_GROUP" "mode":"Full"}
                           {"file":"cif_ms.dic"  "save":"MS_GROUP" "mode":"Full"}]

save_

save__refln.F_complex

_definition.id                          '_refln.F_complex'
loop_
  _alias.definition_id
         '_refln.F_complex'  
         '_refln_F_complex' 
_definition.update                      2016-12-19
_description.text                       
;
     The structure factor vector for the reflection calculated from
     the modulated structure given in the datablock identified by
     _refln.phase_id
;
_name.category_id                       refln
_name.object_id                         F_complex
_type.purpose                           Measurand
_type.source                            Derived
_type.container                         Single
_type.contents                          Complex
_enumeration.default                    0.

#
#  A complete dREL expression for F_complex can be provided here, using all of the
#  parameters provided in the powder and modulated structure dictionaries.
#
save_

save__refln.F_meas

_definition.id                          '_refln.F_meas'
loop_
  _alias.definition_id
         '_refln.F_meas'     
         '_refln_F_meas' 
_definition.update                      2016-12-19
_description.text                       
;
     The structure factor amplitude for the modulated reflection based on
     partitioning of each observed powder diffraction intensity between
     contributing reflections in proportion to the model reflection contributions.
;
_name.category_id                       refln
_name.object_id                         F_meas
_type.purpose                           Measurand
_type.source                            Derived
_type.container                         Single
_type.contents                          Real
_enumeration.default                    0.
#
# A complete dREL expression for calculating F_meas from an observed powder diffractogram can be given here.
#
save_
```
