Discussion List Archives

RE: Standard uncertainties (SU) in the DDLm dictionary

Hello Antanas,

Yes, a CIF dictionary must be interpreted with respect to the DDL in which it is written.   Therefore, yes, for a data file to be valid with respect to a given dictionary, values therein for items defined in DDLm as having _type.purpose "Measurand" must be accompanied by SUs, expressed in one of the permitted ways.

But dictionaries defining such items do not have to permit both mechanisms.  Not only do they not need to define a corresponding *_su item, they *also*, separately, do not need to allow values to take the parenthesized-SU form.  Technically, they could define items that make no allowance for either mechanism; in such a case, the item definition could be valid with respect to DDLm, yet no data file providing values for such an item would be valid with respect to the dictionary, there being no way to provide the required SUs.
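To make the combinations above concrete, here is a minimal Python sketch. It is an illustration only, not DDLm or dREL: the `SuPolicy` class and the `measurand_is_valid` function are hypothetical names, and the sketch assumes that supplying an SU through a mechanism the dictionary does not provide makes the value invalid. It leaves aside the question of whether supplying both forms at once is acceptable.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SuPolicy:
    """Hypothetical summary of what a dictionary allows for one measurand item."""
    allows_parenthesized: bool  # value may be written like 13.3(2)
    has_su_item: bool           # a separate *_su data name is defined


def measurand_is_valid(policy, has_paren_su, has_su_value):
    """Return True if an SU is supplied, and only through permitted mechanisms.

    Assumption of this sketch: an SU given through a mechanism the
    dictionary does not provide counts as invalid.
    """
    if has_paren_su and not policy.allows_parenthesized:
        return False
    if has_su_value and not policy.has_su_item:
        return False
    # A measurand must carry its SU through at least one mechanism.
    return has_paren_su or has_su_value


# A dictionary that permits neither mechanism: no data file can satisfy it.
neither = SuPolicy(allows_parenthesized=False, has_su_item=False)
outcomes = [
    measurand_is_valid(neither, paren, su)
    for paren in (True, False)
    for su in (True, False)
]
print(any(outcomes))
```

With the `neither` policy every combination fails, which is the degenerate case described above: the definition is valid DDLm, yet no value for the item can be valid.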

It is perhaps a bit inconvenient that DDLm does not have an explicit attribute or enumerated code indicating whether an item's value accommodates a parenthesized SU, but that lack does not mean that such a rule cannot be formally expressed.  DDLm makes provision for item definitions to have associated validation methods, which should be entirely up to the task.

I anticipate that dictionaries written in DDLm will typically choose one of the two options for expressing SUs, use it for all measurands, and actively avoid making any allowance for the other alternative.  It is the intention that DDLm should be strictly more powerful than both DDL1 and DDL2, capable of defining dictionaries that are fully equivalent to each of the existing DDL1 and DDL2 dictionaries.  That includes describing mmCIF items that do not accept SUs expressed in the parenthesized form.  It may be that the current form of DDLm is not quite there, but the area you're asking about is covered (however well or poorly).


Regards,

John

-----Original Message-----
From: cif-developers [mailto:cif-developers-bounces@iucr.org] On Behalf Of Antanas Vaitkus
Sent: Thursday, April 13, 2017 5:33 AM
To: cif-developers@iucr.org
Subject: Re: Standard uncertainties (SU) in the DDLm dictionary

Hello John,

thank you for the clarification. I do understand that the DDLs are used to write dictionaries, but doesn't this in turn also imply that they explain the way these dictionaries should be interpreted while validating CIF files?

On 04/12/2017 04:12 PM, Bollinger, John C wrote:

> This particular freedom accommodates the variety of current practices.
> DDL2 dictionaries, especially mmCIF, define separate data names for standard
> uncertainties, and documents must use that mechanism to convey them in order
> to be valid with respect to those dictionaries.  DDL1 dictionaries such as Core CIF,
> on the other hand, generally do not define separate data names for standard uncertainties,
> and documents must therefore use the parenthesized form in order to be valid with respect
> to *those* dictionaries.  The DDLm versions of the Core and mmCIF dictionaries do not afford
> any different options to data files than the original DDL1 and DDL2 dictionaries do.

I fully understand the reasoning behind allowing both of the options.
However, in DDL1 and DDL2 the distinction between the parentheses "()"
notation and the separate "*_su" data item was made clear:
1) both DDL1 and DDL2 had a way of specifying that the appended
   parentheses are allowed (using the "esd/su" as the value of the
   _type_conditions and _item_type_conditions.code data items
   respectively). That did not affect the existence of "*_su" data
   items in any way;
2) DDL2 had the '_item_related.function_code' data item which allowed
   one to specify which data item should hold the standard uncertainty.
   This in turn did not affect the parentheses "()" notation in any way.

Following this logic, formally the mmCIF dictionary v2.0.09 allows both the "()" and the "*_su" notations to specify the standard uncertainty for multiple data items. For example, the _cell.length_a data item has both the "esd" property and a separate "*_su" data item:

### Example 1 begin ###
save__cell.length_a
    _item_description.description
;              Unit-cell length a corresponding to the structure reported in
              angstroms.
;
    _item.name                  '_cell.length_a'
    _item.category_id             cell
    _item.mandatory_code          no
    _item_aliases.alias_name    '_cell_length_a'
    _item_aliases.dictionary      cif_core.dic
    _item_aliases.version         2.0.1
    loop_
    _item_dependent.dependent_name
                                '_cell.length_b'
                                '_cell.length_c'
    loop_
    _item_range.maximum
    _item_range.minimum            .    0.0
                                  0.0   0.0
    _item_related.related_name  '_cell.length_a_esd'
    _item_related.function_code   associated_esd
    _item_sub_category.id         cell_length
    _item_type.code               float
    _item_type_conditions.code    esd
    _item_units.code              angstroms
     save_

save__cell.length_a_esd
    _item_description.description
;              The standard uncertainty (estimated standard deviation)
               of _cell.length_a.
;
    _item.name                  '_cell.length_a_esd'
    _item.category_id             cell
    _item.mandatory_code          no
#    _item_default.value           0.0
    loop_
    _item_dependent.dependent_name
                                '_cell.length_b_esd'
                                '_cell.length_c_esd'
    _item_related.related_name  '_cell.length_a'
    _item_related.function_code   associated_value
    _item_sub_category.id         cell_length_esd
    _item_type.code               float
    _item_units.code              angstroms
     save_
### Example 1 end ###

DDLm has mechanism 2), but seems to lack mechanism 1). As a result, there seems to be no explicit way to allow (or disallow) the parenthesis notation in DDLm. Should it be assumed that these two ways of specifying the standard uncertainty are mutually exclusive for any given data item? That is, if a separate data item is defined in the dictionary, is the "()" notation then disallowed? The question is about how the *dictionary* file should be interpreted for CIF validation purposes.

For example, how should this excerpt from the DDLm cif_core.dic
(https://github.com/COMCIFS/cif_core) be interpreted:

### Example 2 begin ####
save__cell.length_a

_definition.id                          '_cell.length_a'
loop_
  _alias.definition_id
         '_cell_length_a'
         '_cell.length_a'
_import.get [{'save':cell_length  'file':templ_attr.cif}]
_name.category_id                       cell
_name.object_id                         length_a

save_


save__cell.length_a_su

_definition.id                          '_cell.length_a_su'
loop_
  _alias.definition_id
         '_cell_length_a_su'
         '_cell.length_a_esd'
_import.get [{'save':cell_length_su  'file':templ_attr.cif}]
_name.category_id                       cell
_name.object_id                         length_a_su
_name.linked_item_id                    '_cell.length_a'

save_

# excerpt from the templ_attr.cif
save_cell_length

    _definition.update           2014-06-08
    _description.text
;
     The length of each cell axis.
;
    _type.purpose                Measurand
    _type.source                 Recorded
    _type.container              Single
    _type.contents               Real
    _enumeration.range           1.:
    _units.code                  angstroms
     save_
### Example 2 end ###

a) Would a CIF with the following value be valid according to this dictionary?

### Example 2.1 begin ####
_cell.length_a 13.3(2)
### Example 2.1 end ###

b) And how about a CIF file with the following values?

### Example 2.2 begin ###
_cell.length_a 13.3(2)
_cell.length_a_su 0.3 # the su mismatch is on purpose
### Example 2.2 end ###
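
The mismatch in Example 2.2 can be made concrete with a minimal Python sketch. It assumes the usual CIF reading in which the parenthesized digits apply to the last decimal places of the value, so 13.3(2) carries an SU of 0.2. The function name `parse_measurand` is hypothetical, exponent notation is not handled, and the sketch only detects the disagreement; whether a validator should reject such a file is exactly the open question.

```python
import re
from decimal import Decimal

# Matches a plain decimal number followed by SU digits in parentheses, e.g. 13.3(2).
# Exponent notation (e.g. 1.23(4)e2) is deliberately not handled in this sketch.
_PAREN_SU = re.compile(r"^([+-]?(?:\d+\.?\d*|\.\d+))\((\d+)\)$")


def parse_measurand(text):
    """Return (value, su) as Decimals; su is None when no parenthesized SU is present."""
    match = _PAREN_SU.match(text)
    if match is None:
        return Decimal(text), None
    mantissa, su_digits = match.groups()
    # The SU digits are scaled to the last decimal places of the value.
    places = len(mantissa.partition(".")[2])
    return Decimal(mantissa), Decimal(su_digits).scaleb(-places)


value, inline_su = parse_measurand("13.3(2)")  # _cell.length_a in Example 2.2
separate_su, _ = parse_measurand("0.3")        # _cell.length_a_su in Example 2.2
su_values_agree = inline_su == separate_su
print(value, inline_su, separate_su, su_values_agree)
```

Here the inline SU is 0.2 while the separate item gives 0.3, so the two mechanisms disagree.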

Also, is it correct to assume that the policy toward specifying the SU is stricter in DDLm? Previous DDLs specified that the number *might* be accompanied by its SU, whereas DDLm uses the word *must* in this context (as in "This value must be accompanied by its standard uncertainty").
_______________________________________________
cif-developers mailing list
cif-developers@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/cif-developers

