Discussion List Archives

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: COMCIFS approval of proposal to flag dataname redefinition

This seems reasonable to me.

Jim Kaduk

On 2017-03-22 20:38, James Hester wrote:
> Dear COMCIFS,
> 
> As per the my email of last December (see
> http://www.iucr.org/__data/iucr/lists/comcifs-l/msg00736.html) a
> simple mechanism has now been developed to clearly flag the presence
> of redefined datanames in a datafile.  Neither the ddlm-group nor the
> core CIF DMG have raised objections. The full proposal is reproduced
> below, and can be read in a more nicely formatted form at
> https://github.com/COMCIFS/comcifs.github.io/blob/master/audit.formalism_proposal.md
> 
> All comments are welcome. If no objections are received within 3
> weeks, this proposal will be considered approved.
> 
> James Hester (chair).
> 
> ==========================================
> 
> # Proposal for new dataname and attribute to cover differing models
> 
> ## Introduction
> 
> The following proposal implements part of a solution to incorporating
> multiple models into CIF, discussed
> [here](changing_meanings_discussion_paper.md [1]).  It should be read
> in
> conjunction with that document.
> 
> ## New datanames: `_audit.formalism` and `_audit.formalism_version`
> 
> Each value of `_audit.formalism` corresponds to a particular way of
> deriving some set of CIF datanames from other datanames defined in the
> same, or imported, dictionaries.  For better interoperability, we
> stipulate that datanames may only be redefined by dictionaries if the
> redefined datanames take values with the same units and domain as the
> original datanames.  So, for example, `_refln.F_complex` for magnetic
> structures is calculated differently to `_refln.F.complex` for core
> CIF, but has the same units and domain (positive real numbers), so it
> is acceptable for a magnetic dictionary to redefine
> `_refln.F_complex`.
> 
> `_audit.formalism_version` is provided to allow secondary parameters
> to be added to the model without changing the overall formalism. As a
> guide, parameters are considered secondary if they do not require the
> addition of new columns to any category, and do not significantly
> change the final calculated values in "typical" cases.
> 
> All CIF datablocks should include these new datanames when they take
> non-default values; the default values correspond to the
> single-crystal model described in core CIF.  Any CIF reading programs
> that perform calculations should check `_audit.formalism` and
> `_audit.formalism_version`
> datanames in order to avoid miscalculating derived values.
> 
> The choice of the word `formalism` is purely to avoid clashing with
> the widespread use of `model` in core CIF to refer to the particular
> arrangement of atoms. There may be a better word.  See the appendix
> for formal DDLm definitions.
> 
> ## New DDLm attributes: `_dictionary.formalism` and
> `_dictionary.formalism_version`
> 
> These attributes associate a dictionary with a particular formalism.
> 
> ## Treatment of current dictionaries
> 
> ### Modulated structures
> 
> The modulated structures dictionary is assigned formalism `modulated`
> and redefines `_refln.F_complex`, `_refln.sin_theta_over_lambda` and
> `_refln.symmetry_multiplicity`.
> 
> ### Magnetism
> 
> The magnetism dictionary builds on the modulated structures
> dictionary.
> It is assigned formalism `magnetic` and redefines `_refln.F_complex`
> only.
> 
> ### Powder
> 
> The powder dictionary calculates structure factors from information
> that may be held in a different datablock.  It therefore redefines
> `_refln.F_complex`.  `_refln.F_meas` is also redefined as the
> determination of this from the powder observations is markedly
> different to the way in which it is derived from single-crystal spots,
> not least because of pervasive overlap.
> 
> Separate formalisms are necessary for each possible combination of
> powder with other formalisms, for example `_audit.formalism` can
> take values `powder-magnetic`, `powder-modulated` and
> `powder-multipole`.
> 
> ### Electron density
> 
> The electron density dictionary allows parameterisation of the
> electron density around each atom in terms of multipoles.
> `_refln.F_complex` is redefined, and a formalism of `multipole` is
> assigned.
> 
> ### Constraints and restraints
> 
> This dictionary relates only to the method of determination of the
> final parameters and therefore does not affect the definitions of
> the final datanames.
> 
> ### Twinning
> 
> Twinning does not change the structural model, but it may change the
> way `_refln.F_meas` is calculated from the observations. A formalism
> of `twinning` is assigned, and as for powder separate formalisms need
> to be assigned for each distinct structural model.
> 
> ### Image CIF
> 
> ImgCIF relates only to raw data and is not affected by these changes.
> 
> ### mmCIF
> 
> mmCIF is based on the core CIF model and is therefore unaffected by
> these changes.
> 
> ## Treatment of other techniques
> 
> ### Laue
> 
> A Laue experiment measures distinct spots, but each spot is produced
> by a distinct wavelength, and spots sometimes overlap.
> `_refln.wavelength` therefore becomes an additional key column in
> `refln`. This change by itself is easily covered by defining a
> different `_audit.schema`.  However, a Laue dictionary must also
> redefine `_refln.F_meas` as the extraction of notional observed
> intensities will depend on the model for wavelength distribution, and
> so we must assign a separate `_audit.formalism`. As for powder and
> twinning, there will be a separate `formalism` for each distinct
> structural model.
> 
> ## Discussion
> 
> ### Mixing and matching not possible
> 
> It is tempting to define something like `_audit.technique` to cover
> the technique-based differences, so that `_audit.technique` and
> `_audit.formalism` could correspond to different dictionaries that
> could be mixed and matched. So, instead of a `powder-magnetism`
> formalism, there would simply be a `powder` technique combined with a
> `magnetism` formalism, with both dictionaries being separately
> imported
> and notionally orthogonal to one another.
> 
> However, any `formalism` that adds keys to the `refln` category will
> also require the `technique` to be aware of those keys in order to
> explain how `_refln.F_meas` is determined.  For example, a powder
> experiment on a modulated structure will calculate the `_refln.F_meas`
> value differently to a powder experiment on a non-modulated structure
> as the calculations of peak position require different numbers of
> indices.  Therefore, it is not possible to generally separate the
> technique from the structural model, although it may be possible in
> particular cases.
> 
> ### Just use `_audit_conform`?
> 
> Core CIF has long provided the `_audit_conform_dict_*` tags to state
> which
> dictionary or dictionaries a datablock conforms to.  This appears
> almost
> as simple as the proposed `_audit.formalism` tag, so the need for a
> separate tag may not be apparent.
> 
> While the `_audit_conform` mechanism must remain the
> canonical source of information, the proposed dataname provides a
> simplified route to the same information. In order for a CIF reading
> program to confirm that none of the dictionaries listed in a CIF block
> change any of the definitions relied upon by that program, it must in
> general download the stated versions of the dictionary or dictionaries
> from the canonical IUCr site, parse and merge them, and then find any
> definitions that have (apparently) been replaced.  Compared to this
> procedure, the `_audit.formalism` tag is a much simpler way for the
> datablock writer to specify to the datablock reader a particular set
> of dataname interpretations that may never change.
> 
> Note that the `_audit_conform_*` mechanism is almost never used. As of
> May 28, 2016, there were 195 modulated structures in the
> Crystallographic Open Database (as determined by the presence of
> `Fourier_wave_vector` in a file). Of these, zero had an
> `_audit_conform` entry, which in theory would be required to explain
> the adjusted interpretation of the `refln` category and new datanames.
> We conclude that introduction of `_audit.formalism` should be
> accompanied by an education and outreach program as well.
> 
> ### Interaction with `_audit.schema`
> 
> `_audit.schema` essentially allows fixed parameters to vary. It
> is therefore orthogonal to `_audit.formalism`: a given formalism
> may have many possible schemas, and many schemas may apply to
> multiple formalisms if they share the same parameters.
> 
> In other words, a suitably-written program can handle a variety of
> schemas for a single formalism without needing to change the way in
> which any dataname is calculated, whereas a program must change the
> way
> in which the redefined datanames are calculated if the formalism
> changes.
> 
> # Appendix I: New core definitions
> 
> ## _audit.formalism
> 
> ```
> save_audit.formalism
> 
> _definition.id [2]       '_audit.formalism'
> _name.category_id    audit
> _name.object_id      formalism
> _description.text
> ;
> 
>      The CIF dictionaries listed in _audit.dictionary may redefine
>      datanames. _audit.formalism is provided as an efficient
>      alternative to parsing and checking those dictionaries. It
>      identifies commonly-used sets of meanings for datanames. In
>      general, each value taken by _audit.formalism is linked to a
>      particular technique and/or structural approach.  The
>      dictionaries for the datablock (see _audit.dictionary) must be
>      compatible with the value of _audit.formalism.
> 
> ;
> _type.contents          Text
> _type.purpose           State
> _type.container         Single
> _type.source            Assigned
> loop_
> _enumeration_set.state
> _enumeration_set.detail
>     Base                'Single crystal model from core CIF'
>     Modulated           'Single crystal modulated structure'
>     Magnetic            'Single crystal magnetic structure,
> potentially modulated'
>     Powder              'Powder diffraction experiment'
>     Twinned             'Twinned crystal using core CIF model'
>     Multipole           'Single crystal model with multipole
> coefficients'
>     Laue                'Laue experiment on single crystal'
>     Powder-Modulated    'Powder experiment on a modulated structure'
>     Powder-Magnetic     'Powder experiment on a modulated magnetic
> structure'
>     Powder-Multipole    'Powder experiment modelled with multipoles'
>     Laue-Magnetic       'Laue experiment on magnetic structure'
>     Laue-Modulated      'Laue experiment on modulated non-magnetic
> structure'
>     Laue-Multipole      'Laue experiment modelled with multipoles'
>     Twinned-Magnetic    'Twinned magnetic single crystal structure'
>     Twinned-Modulated   'Twinned modulated single crystal structure'
>     Laue-Twinned        'Laue experiment on twinned single crystal'
>     Laue-Twinned-Modulated 'Laue experiment on twinned modulated
> structure'
>     Custom              'Examine dictionaries provided in
> _audit_conform'
>     Local               'Locally modified dictionaries. These
> datafiles should not be distributed'
> _enumeration.default    Base
> save_
> ```
> 
> ## _audit.formalism_version
> 
> ```
> save_audit.formalism_version
> 
> _definition.id [2]       '_audit.formalism_version'
> _name.category_id    audit
> _name.object_id      formalism_version
> _description.text
> ;
> 
>      The version of the given formalism (see `_audit.formalism`). The
> version
>      number of a formalism is incremented when new model parameters
> are
>      added that do not significantly affect the model values in
> typical cases.
> 
> ;
> _type.contents          Text
> _type.purpose           State
> _type.container         Single
> _type.source            Assigned
> _enumeration.default    1.0
> save_
> ```
> 
> ## _dictionary.formalism
> 
> ```
> save_dictionary.formalism
> 
>     _definition.id [2]               '_dictionary.formalism'
>     _definition.class            Attribute
>     _definition.update           2016-12-17
>     _description.text
> ;
> 
>      The value of this attribute is associated with the set of
>      dataname meanings contained in this dictionary.
> 
> ;
>     _name.category_id            dictionary
>     _name.object_id              formalism
>     _type.purpose                Audit
>     _type.source                 Assigned
>     _type.container              Single
>     _type.contents               Text
> 
> save_
> ```
> 
> # Appendix II: a full hybrid dictionary
> 
> A complete dictionary using the above mechanisms is presented below.
> 
> ```
> #\#CIF_2.0
> ####################################################
> #                                                  #
> #    Dictionary for modulated powder diffraction   #
> #                                                  #
> ####################################################
> 
> data_MODPOW
> 
> _dictionary.title             MODPOW
> _dictionary.formalism         Powder-Modulated
> _dictionary.class             Instance
> _dictionary.version           1.0
> _dictionary.date              2016-12-19
> _dictionary.ddl_conformance   3.12
> _dictionary.namespace         MODPOW
> _description.text
> ;
> 
>     The modulated powder diffraction dictionary redefines datanames
>     for use when presenting the results of a powder diffraction
>     experiment using a modulated structure model.  The remainder of
> the
>     relevant definitions are found in the modulated structures
>     dictionary and the powder diffraction dictionary.
> 
> ;
> 
> save_MODPOW_GROUP
> 
>     _definition.id [2]       MODPOW_GROUP
>     _definition.scope    Category
>     _definition.class    Head
>     _definition.update   2016-12-19
>     _description.text
> ;
>     This category is the parent category for all definitions
>     in the MODPOW dictionary
> ;
> 
>     _name.category_id     MODPOW
>     _name.object_id       MODPOW_GROUP
> 
>     # The following import reads in and reparents all powder and
>     # modulated structure definitions to the MODPOW_GROUP category. As
> cif_ms is
>     # read second, the refln category will have the extra modulation
> indices
>     # defined.
> 
>     _import.get           [{"file":"cif_pow.dic" "save":"PD_GROUP"
> "mode":"Full"}
>                            {"file":"cif_ms.dic"  "save":"MS_GROUP"
> "mode":"Full"}]
> 
> save_
> 
> save__refln.F_complex
> 
> _definition.id [2]                          '_refln.F_complex'
> loop_
>   _alias.definition_id
>          '_refln.F_complex'
>          '_refln_F_complex'
> _definition.update                      2016-12-19
> _description.text
> ;
>      The structure factor vector for the reflection calculated from
>      the modulated structure given in the datablock identified by
>      _refln.phase_id
> ;
> _name.category_id                       refln
> _name.object_id                         F_complex
> _type.purpose                           Measurand
> _type.source                            Derived
> _type.container                         Single
> _type.contents                          Complex
> _enumeration.default                    0.
> 
> #
> #  A complete dREL expression for F_complex can be provided here,
> using all of the
> #  parameters provided in the powder and modulated structure
> dictionaries.
> #
> save_
> 
> save__refln.F_meas
> 
> _definition.id [2]                          '_refln.F_meas'
> loop_
>   _alias.definition_id
>          '_refln.F_meas'
>          '_refln_F_meas'
> _definition.update                      2016-12-19
> _description.text
> ;
>      The structure factor amplitude for the modulated reflection based
> on
>      partitioning of each observed powder diffraction intensity
> between
>      contributing reflections in proportion to the model reflection
> contributions.
> ;
> _name.category_id                       refln
> _name.object_id                         F_meas
> _type.purpose                           Measurand
> _type.source                            Derived
> _type.container                         Single
> _type.contents                          Real
> _enumeration.default                    0.
> #
> # A complete dREL expression for calculating F_meas from an observed
> powder diffractogram can be given here.
> #
> save_
> 
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
> 
> Links:
> ------
> [1] http://changing_meanings_discussion_paper.md
> [2] http://definition.id
> _______________________________________________
> comcifs mailing list
> comcifs@iucr.org
> http://mailman.iucr.org/cgi-bin/mailman/listinfo/comcifs


Reply to: [list | sender only]