Re: [ddlm-group] Clarification of SU in DDLm dictionaries
- To: ddlm-group <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Clarification of SU in DDLm dictionaries
- From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
- Date: Mon, 25 Jan 2021 16:09:05 +0000
- In-Reply-To: <CAM+dB2eN3WyVwimt0uY0LyXTdyV5U776QQO058R_wsBobXPKmw@mail.gmail.com>
- References: <CAM+dB2eN3WyVwimt0uY0LyXTdyV5U776QQO058R_wsBobXPKmw@mail.gmail.com>
Dear DDLm Group,
I am satisfied with items (1) and (2) of the proposal. I am less satisfied with some details of item (3) and therefore with item (4).
I appreciate and accept that one of the objectives is to achieve independence of DDLm and of our data dictionaries from the format of data files. On the other hand, I would also prefer to avoid loading the CIF serialization format, or others, with implicit dictionary-related semantics. Thus, if the value presented for a measurand item is permitted to additionally convey a value for that item's associated SU item -- and we need that to support a large volume of existing data -- then that should be expressed by the measurand's definition, presumably as an aspect of its data type.
Thus, I suggest that the DDLm definition of measurands be adjusted to be generic with respect to the details of how a measurand value that conveys its own SU is represented, without removing the explicit provision for measurand and su being presented in combined form. In conjunction with that, the chapter on CIF file syntax should mention the CIF convention for this. The last bit may be a little tricky from the perspective of separating convention from rule, but as one of the more ardent supporters of maintaining that separation, I think a satisfactory compromise can be reached in that area.
The revised DDLm definition for 'Measurand' might then be more like this:
```
Used to type an item with a numerically estimated value
that has been recorded by measurement or derivation. A data
name definition for the standard uncertainty (SU) of this item
must be provided in a separate definition with
`_type.purpose` of `SU`. The value of a measurand item
must be accompanied by a value of its associated SU
item, expressed either:
1) integrated with the measurand value in a manner
   characteristic of the data format,
or
2) as a separate, explicit value for the associated SU item.
These alternatives are semantically equivalent.
```
That formulation is intended to have the same properties as James's with respect to the questions he presented. I think the entire discussion he presented applies to this variation on the definition, too. The most important part to me is the last sentence, which I intend as a foundation for a more explicit specification, as described above, of the CIF syntax details that James's alternative would leave implicit.
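To make alternative (1) concrete for the CIF case, here is a minimal Python sketch of how a reader might split a combined value such as `1.234(5)` into the measurand value and the SU that would otherwise be given explicitly. The function name `split_measurand` is hypothetical, and the sketch assumes plain decimal numbers without exponents; it is an illustration of the parenthetical convention, not a complete CIF number parser.

```python
import re
from decimal import Decimal
from typing import Optional, Tuple

# Plain decimal number, optionally followed by an unsigned integer in
# parentheses. Exponent notation is deliberately left out of this sketch.
_NUM_WITH_SU = re.compile(r"^([+-]?(?:\d+\.?\d*|\.\d+))(?:\((\d+)\))?$")


def split_measurand(text: str) -> Tuple[Decimal, Optional[Decimal]]:
    """Return (value, su) for a CIF-style numeric string.

    The parenthesised digits are interpreted at the precision of the
    trailing digits of the value, so "1.234(5)" yields an SU of 0.005.
    If no SU is embedded, su is None.
    """
    match = _NUM_WITH_SU.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised numeric value: {text!r}")
    value = Decimal(match.group(1))
    if match.group(2) is None:
        return value, None
    # The decimal exponent of the value gives the position of its last digit.
    last_digit_exponent = value.as_tuple().exponent
    su = Decimal(match.group(2)).scaleb(last_digit_exponent)
    return value, su


print(split_measurand("1.234(5)"))   # value 1.234 with SU 0.005
print(split_measurand("-12.3(12)"))  # value -12.3 with SU 1.2
```

Under the proposed wording, the pair returned here would carry the same meaning as a bare value of `1.234` accompanied by a separate SU item with value `0.005`.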
Regards,
John
--
John C. Bollinger, Ph.D., RHCSA
Computing and X-ray Scientist
Department of Structural Biology
St. Jude Children's Research Hospital
From: ddlm-group <ddlm-group-bounces@iucr.org> on behalf of James Hester <jamesrhester@gmail.com>
Sent: Sunday, January 24, 2021 8:04 PM
To: ddlm-group <ddlm-group@iucr.org>
Subject: [ddlm-group] Clarification of SU in DDLm dictionaries
Dear DDLm-group,
A careful reviewer of the DDLm volume G chapter has noted issues with the way in which we treat standard uncertainties. I have created a draft proposal for discussion at https://github.com/COMCIFS/comcifs.github.io/blob/master/draft/su_discussion.md
(reproduced below). Please provide your comments and once we have come to a resolution I will pass the final result on to COMCIFS for confirmation.
thanks,
James.
================================================================
# Proposal: treatment of SU in DDLM dictionaries
## Introduction

There is some residual ambiguity around the treatment of su in our DDLm dictionaries. Currently, if `_type.purpose` for a data name is `Measurand`, the DDLm attribute dictionary states:

```
Used to type an item with a numerically estimated value
that has been recorded by measurement or derivation. This value
must be accompanied by its standard uncertainty (SU) value,
expressed either as:
1) appended integers, in parentheses (), at the precision of
   the trailing digits, or
2) a separately defined item with the same name as the measurand
   item but with an additional suffix '_su'.
```

This raises the following issues:

1. Option (1) presupposes CIF format. DDLm should be agnostic regarding format
2. Should the `_su` form of the data name be explicitly defined in the dictionary?
3. Is it legal to provide both the `_su` form and the parenthetical form for a data name?
4. Does the value of a `Measurand` data name for the purpose of dREL include the SU?
5. Can the `_su` suffix be a requirement when the current DDLm dictionaries contain data names that do not follow this? (e.g. `_refln.F_sigma`).

The following proposal aims to clarify these questions.

## Proposal

1. That all `Measurand` data names have a corresponding data name for their SU explicitly defined;
2. That the convention for IUCr dictionaries is that this data name is formed by adding `_su` to the original data name;
3. That the parenthetical form of presentation of the su value for CIF syntax is understood as a shorthand assignment of this su value to the associated SU dataname;
4. That the definition for `Measurand` is therefore rewritten as:

```
Used to type an item with a numerically estimated value
that has been recorded by measurement or derivation. A data
name definition for the standard uncertainty (SU) of this item
must be provided in a separate definition with
`_type.purpose` of `SU`.
```

The above questions are then answered as follows:

1. The new definition is format-agnostic
2. Yes, `_su` forms should be defined in the dictionary. Using `_su` as a suffix is purely an IUCr convention which is not always followed (e.g. `_refln.F_sigma`) and therefore not appropriate for the DDLm attribute dictionary to specify.
3. Yes, it is *syntactically* legal to have both forms, as the CIF syntax can have no embedded understanding of the meaning of the data names, including `*_su` data names, and therefore duplication cannot be detected as a syntax error. It is instead a semantic error in the same way as a cell volume - cell parameter mismatch would be. Thus if the two values provided agree, there is no error, and if they disagree, the software can take steps based on the importance of the mismatch to the particular computation.
4. No, the value of a `Measurand` data name includes the main value only.

## Discussion

In order for DDLm to be format-agnostic, each format needs to associate some location in that format with a data name. The appearance of a value in a CIF file without the data name appearing as well (as is being proposed above) is thus not unusual in general, simply for CIF this association is usually transparent due to the data name appearing in the format itself.

### Compatibility

#### CIF authoring software

Authoring software remains free to append SU in parentheses.

#### CIF reading software

Legacy CIF reading software will have the same problems that it presumably has with the new 'dotted' data names, in the sense that a data name that was unknown at the time of software preparation has been used to provide a value. This is a cost that we have accepted.

### Other comments

The su of a data item must always have been treated separately in software, as software must handle the su differently to the main value due at least to the differences in the way errors are propagated. The creation of a separate data name captures this fact.
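The answer to question 3 above (both forms are syntactically legal, and disagreement is a semantic matter) could be handled by reading software along the following lines. This is only a sketch under stated assumptions: the names `embedded_su` and `su_status` and the three status strings are hypothetical, exponent notation is not handled, and what to do on a mismatch is left to the caller, as the proposal suggests.

```python
import re
from decimal import Decimal
from typing import Optional

# Plain decimal value followed by parenthesised SU digits, e.g. "10.25(3)".
_PAREN_SU = re.compile(r"^([+-]?\d+(?:\.\d*)?)\((\d+)\)$")


def embedded_su(text: str) -> Optional[Decimal]:
    """Return the SU carried in parentheses, or None if there is none."""
    match = _PAREN_SU.match(text.strip())
    if match is None:
        return None
    # Scale the SU digits to the precision of the value's trailing digits.
    exponent = Decimal(match.group(1)).as_tuple().exponent
    return Decimal(match.group(2)).scaleb(exponent)


def su_status(measurand_text: str, explicit_su_text: Optional[str] = None) -> str:
    """Classify how the SU of one measurand has been supplied.

    measurand_text   -- the value given for the measurand data name
    explicit_su_text -- the value given for the associated SU data name,
                        or None if that data name is absent
    """
    inline = embedded_su(measurand_text)
    explicit = Decimal(explicit_su_text) if explicit_su_text is not None else None
    if inline is None and explicit is None:
        return "missing"
    if inline is None or explicit is None or inline == explicit:
        return "consistent"
    # Both forms present but different: a semantic error, not a syntax error.
    return "mismatch"


print(su_status("10.25(3)", "0.03"))  # both forms given and in agreement
print(su_status("10.25(3)", "0.04"))  # both forms given but in disagreement
```

Reporting a status rather than raising an exception reflects the point that the software can decide how much a mismatch matters for the computation at hand.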