Re: Standard uncertainties (SU) in the DDLm dictionary

Subject: Re: Standard uncertainties (SU) in the DDLm dictionary
From: Antanas Vaitkus <antanas.vaitkus90@xxxxxxxxx>
Date: Thu, 13 Apr 2017 17:30:15 +0300

Dear John,

the lack of an explicit way to specify the "()" notation is indeed an inconvenience -- maybe COMCIFS would consider adding a mechanism for that to the DDLm dictionary? Anyway, thank you for clearing things up.

2017-04-13 16:39 GMT+03:00 Bollinger, John C <John.Bollinger@stjude.org>:

Hello Antanas,

Yes, a CIF dictionary must be interpreted with respect to the DDL in which it is written. Therefore, yes, for a data file to be valid with respect to a given dictionary, values therein for items defined in DDLm as having _type.purpose "Measurand" must be accompanied by SUs, expressed in one of the permitted ways.

But dictionaries defining such items do not have to permit both mechanisms. Not only do they not need to define a corresponding *_su item, they *also*, separately, do not need to allow values to take the parenthesized-SU form. Technically, they could define items that make no allowance for either mechanism; in such a case, the item definition could be valid with respect to DDLm, yet no data file providing values for such an item would be valid with respect to the dictionary, there being no way to provide the required SUs.
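
The combinations described above can be sketched in a few lines of Python. This is only an illustration of one reading of the paragraph: the function name and the four boolean flags are hypothetical and not part of DDLm. It assumes that using a mechanism the dictionary does not permit makes the value invalid, and it does not settle whether supplying both forms at once is acceptable when both are permitted.

```python
def su_requirement_met(allow_paren, allow_su_item, has_paren, has_su_item):
    """Check whether a measurand value satisfies the SU requirement.

    allow_paren / allow_su_item: mechanisms the dictionary permits (assumed flags).
    has_paren / has_su_item: mechanisms the data file actually used.
    """
    # A mechanism the dictionary does not permit cannot be used.
    if has_paren and not allow_paren:
        return False
    if has_su_item and not allow_su_item:
        return False
    # A measurand must be accompanied by an SU in at least one permitted way.
    return has_paren or has_su_item


# A dictionary that permits neither mechanism can never be satisfied.
never_valid = not any(
    su_requirement_met(False, False, paren, item)
    for paren in (True, False)
    for item in (True, False)
)
```

Here `never_valid` evaluates to true, which corresponds to the case where the item definition is valid DDLm but no data file can be valid against it.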

It is perhaps a bit inconvenient that DDLm does not have an explicit attribute or enumerated code indicating whether an item's value accommodates a parenthesized SU, but that lack does not mean that such a rule cannot be formally expressed. DDLm makes provision for item definitions to have associated validation methods, which should be entirely up to the task.

I anticipate that dictionaries written in DDLm will typically choose one of the two options for expressing SUs, use it for all measurands, and actively avoid making any allowance for the other alternative. It is the intention that DDLm should be strictly more powerful than both DDL1 and DDL2, capable of defining dictionaries that are fully equivalent to each of the existing DDL1 and DDL2 dictionaries. That includes describing mmCIF items that do not accept SUs expressed in the parenthesized form. It may be that the current form of DDLm is not quite there, but the area you're asking about is covered (however well or poorly).

Regards,

John

-----Original Message-----
From: cif-developers [mailto:cif-developers-bounces@iucr.org] On Behalf Of Antanas Vaitkus
Sent: Thursday, April 13, 2017 5:33 AM
To: cif-developers@iucr.org
Subject: Re: Standard uncertainties (SU) in the DDLm dictionary

Hello John,

thank you for the clarification. I do understand that the DDLs are used to write dictionaries, but doesn't this in turn also imply that they explain the way these dictionaries should be interpreted while validating CIF files?

On 04/12/2017 04:12 PM, Bollinger, John C wrote:

> This particular freedom accommodates the variety of current practices.
> DDL2 dictionaries, especially mmCIF, define separate data names for
> standard uncertainties, and documents must use that mechanism to convey
> them in order to be valid with respect to those dictionaries. DDL1
> dictionaries such as Core CIF, on the other hand, generally do not define
> separate data names for standard uncertainties, and documents must
> therefore use the parenthesized form in order to be valid with respect
> to *those* dictionaries. The DDLm versions of the Core and mmCIF
> dictionaries do not afford any different options to data files than the
> original DDL1 and DDL2 dictionaries do.

I fully understand the reasoning behind allowing both of the options.
However, in DDL1 and DDL2 the distinction between the parentheses "()"
notation and the separate "*_su" data item was made clear:
1) both DDL1 and DDL2 had a way of specifying that the appended
parentheses are allowed (using the "esd/su" as the value of the
_type_conditions and _item_type_conditions.code data items
respectively). That did not affect the existence of "*_su" data
items in any way;
2) DDL2 had the '_item_related.function_code' data item which allowed
one to specify which data item should hold the standard uncertainty.
This in turn did not affect the parentheses "()" notation in any way.

Following this logic, formally the mmCIF dictionary v2.0.09 allows both the "()" and the "*_su" notations to specify the standard uncertainty for multiple data items. For example, the _cell.length_a data item has both the "esd" property and a separate "*_su" data item:

### Example 1 begin ###
save__cell.length_a
_item_description.description
; Unit-cell length a corresponding to the structure reported in
angstroms.
;
_item.name '_cell.length_a'
_item.category_id cell
_item.mandatory_code no
_item_aliases.alias_name '_cell_length_a'
_item_aliases.dictionary cif_core.dic
_item_aliases.version 2.0.1
loop_
_item_dependent.dependent_name
'_cell.length_b'
'_cell.length_c'
loop_
_item_range.maximum
_item_range.minimum
. 0.0
0.0 0.0
_item_related.related_name '_cell.length_a_esd'
_item_related.function_code associated_esd
_item_sub_category.id cell_length
_item_type.code float
_item_type_conditions.code esd
_item_units.code angstroms
save_

save__cell.length_a_esd
_item_description.description
; The standard uncertainty (estimated standard deviation)
of _cell.length_a.
;
_item.name '_cell.length_a_esd'
_item.category_id cell
_item.mandatory_code no
# _item_default.value 0.0
loop_
_item_dependent.dependent_name
'_cell.length_b_esd'
'_cell.length_c_esd'
_item_related.related_name '_cell.length_a'
_item_related.function_code associated_value
_item_sub_category.id cell_length_esd
_item_type.code float
_item_units.code angstroms
save_
### Example 1 end ###
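
The reading applied to Example 1 can be sketched in Python as follows. The dictionary layout used to hold the DDL2 attributes (a plain dict with a list of condition codes and a list of related-item pairs) and the function name are hypothetical; only the attribute names and the codes "esd", "su" and "associated_esd" come from the discussion and the excerpt above.

```python
def ddl2_su_mechanisms(item):
    """Return (parentheses_allowed, su_item_names) for one DDL2 item definition.

    `item` is an assumed in-memory form of a save frame:
      "_item_type_conditions.code": list of condition codes
      "_item_related": list of (related_name, function_code) pairs
    """
    conditions = item.get("_item_type_conditions.code", [])
    related = item.get("_item_related", [])
    # Mechanism 1: the "esd"/"su" type condition permits the "()" notation.
    parentheses_allowed = any(code in ("esd", "su") for code in conditions)
    # Mechanism 2: a related item flagged as holding the associated SU.
    su_items = [name for name, func in related if func == "associated_esd"]
    return parentheses_allowed, su_items


cell_length_a = {
    "_item_type_conditions.code": ["esd"],
    "_item_related": [("_cell.length_a_esd", "associated_esd")],
}
cell_length_a_esd = {
    "_item_related": [("_cell.length_a", "associated_value")],
}

both = ddl2_su_mechanisms(cell_length_a)
neither = ddl2_su_mechanisms(cell_length_a_esd)
```

Under these assumptions `both` reports that _cell.length_a accepts the "()" notation and also has a separate SU item, while `neither` reports no SU mechanism for the SU item itself.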

DDLm has mechanism 2) but seems to lack mechanism 1). As a result, there seems to be no explicit way to allow (or disallow) the parenthesis notation in DDLm. Should it be assumed that these two ways of specifying the standard uncertainty are mutually exclusive for any given data item -- that is, if a separate data item is defined in the dictionary, is the "()" notation then disallowed? The question is about how the *dictionary* file should be interpreted for CIF validation purposes.

For example, how should this excerpt from the DDLm cif_core.dic (https://github.com/COMCIFS/cif_core) be interpreted:

### Example 2 begin ####
save__cell.length_a

_definition.id '_cell.length_a'
loop_
_alias.definition_id
'_cell_length_a'
'_cell.length_a'
_import.get [{'save':cell_length 'file':templ_attr.cif}]
_name.category_id cell
_name.object_id length_a

save_

save__cell.length_a_su

_definition.id '_cell.length_a_su'
loop_
_alias.definition_id
'_cell_length_a_su'
'_cell.length_a_esd'
_import.get [{'save':cell_length_su 'file':templ_attr.cif}]
_name.category_id cell
_name.object_id length_a_su
_name.linked_item_id '_cell.length_a'

save_

# excerpt from the templ_attr.cif
save_cell_length

_definition.update 2014-06-08
_description.text
;
The length of each cell axis.
;
_type.purpose Measurand
_type.source Recorded
_type.container Single
_type.contents Real
_enumeration.range 1.:
_units.code angstroms
save_
### Example 2 end ###

a) Would a CIF with the following value be valid according to this dictionary?

### Example 2.1 begin ####
_cell.length_a 13.3(2)
### Example 2.1 end ###

b) And how about a CIF file with the following values?

### Example 2.2 begin ###
_cell.length_a 13.3(2)
_cell.length_a_su 0.3 # the su mismatch is on purpose
### Example 2.2 end ###
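
To make the mismatch in Example 2.2 concrete, here is a minimal Python sketch of how a validator might expand the "()" notation and compare it with a separate SU value. The function names and the regular expression are assumptions made for illustration, not taken from any CIF library; the sketch relies on the usual convention that the parenthesized digits apply to the last quoted digits of the value.

```python
import re
from decimal import Decimal

# Number with an optional exponent and an optional parenthesized SU, e.g. 13.3(2).
_NUM_SU = re.compile(
    r"^(?P<num>[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)(?:\((?P<su>\d+)\))?$"
)


def split_su(text):
    """Split a CIF numeric string into (value, su); su is None if absent."""
    match = _NUM_SU.match(text)
    if match is None:
        raise ValueError(f"not a CIF number: {text!r}")
    value = Decimal(match.group("num"))
    if match.group("su") is None:
        return value, None
    # The SU digits are scaled to the position of the value's last digit.
    last_digit_exponent = value.as_tuple().exponent
    return value, Decimal(match.group("su")).scaleb(last_digit_exponent)


def su_sources_agree(value_text, su_text):
    """True if the inline SU (when present) equals the separate SU value."""
    _, inline_su = split_su(value_text)
    return inline_su is None or inline_su == Decimal(su_text)


value, inline_su = split_su("13.3(2)")          # inline SU expands to 0.2
consistent = su_sources_agree("13.3(2)", "0.3")  # separate item says 0.3
```

With the values from Example 2.2, the inline SU expands to 0.2 while the separate item gives 0.3, so `consistent` is false. Whether such a file should be rejected, or which of the two values takes precedence, is exactly the question being asked and is not decided by the sketch.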

Also, is it correct to assume that the policy toward specifying the SU is stricter in DDLm? Previous DDLs specified that the number *might* be accompanied by its SU, whereas DDLm uses the word *must* in this context (as in "This value must be accompanied by its standard uncertainty").

_______________________________________________
cif-developers mailing list
cif-developers@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/cif-developers

--

Antanas Vaitkus,

PhD student at Vilnius University Institute of Biotechnology,
room V325, Saulėtekio al. 7,
LT-10257 Vilnius, Lithuania
