Re: [ddlm-group] Multi block principles
- To: "firstname.lastname@example.org" <email@example.com>, "Group finalisingDDLm and associated dictionaries" <firstname.lastname@example.org>
- Subject: Re: [ddlm-group] Multi block principles
- From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
- Date: Fri, 19 Nov 2021 16:16:59 +0000
Dear DDLm group,
Comments inline below.
On Thursday, November 18, 2021 8:41 PM, James Hester wrote:
On Wed, 17 Nov 2021 at 03:41, Bollinger, John C <John.Bollinger@stjude.org> wrote:
I'm not sure I follow. If a powder diffraction experiment splits the structures of, say, 3 component phases over 3 blocks + 1 block for the invariable information, all conforming to 'Base', isn't that information identical to presenting it all in a single block (no longer conforming to 'Base') with appropriate key data names added?
Perhaps, then, it is me who does not follow. Let me try again.
I do not think it is the intention to allow CIF data to be split _arbitrarily_ among data blocks, but if so, then I do object to the “arbitrarily” part. Up to now, different CIF data items have been associated with each other at a basic level by appearing in the same data block. I understand that some communities may have layered additional conventions on top of that, but no such additions have been baked into CIF or any of our DDLs. I want to maintain the data block as a coherent unit of data, such that splitting this …
... into this …
... loses information (that the two items are associated with each other) except in cases that are explicitly provided for by dictionaries and conveyed by data explicitly presented among the relevant collection of data blocks.
The message that one can split data across different blocks should not be de-contextualized or oversold.
What is a zero-column key? Is that like an implicit key with no actual values stored?
I’m not sure it’s a concept that anyone else uses, but it’s a simple generalization of standard ideas:
A key for a relation consists of some subset of attributes of that relation (columns). No two distinct rows of the relation can match in every key attribute, so in the absence of other constraints, as many rows can be present as there are distinct combinations of values drawn from the key attributes’ domains.
Now suppose we want a relation that is restricted to a single row. One way to do that would be to give the relation an attribute whose domain contains only one value, and to designate that as the only key attribute. But in most cases that’s artificial and untidy.
There is a cleaner and simpler alternative: designate a key consisting of _zero_ attributes. There is only one distinct combination of zero values: the empty set / tuple / dictionary. Therefore, a zero-attribute (zero-column) key affords only one row. If we contemplate adding key attributes to a category, then adding them to an existing zero-attribute key is both logically and structurally simpler than converting a category that does not have a key at all into one that does.
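To make the zero-attribute key concrete, here is a minimal Python sketch. The `Relation` class and the sample attribute names (`length_a`, `label`, `x`) are hypothetical, chosen only for illustration; nothing here is part of DDLm or of any CIF library.

```python
class Relation:
    """A toy relation: rows are dicts, the key is a tuple of attribute names."""

    def __init__(self, key_attributes):
        self.key_attributes = tuple(key_attributes)
        self.rows = {}  # maps each row's key tuple to the row itself

    def insert(self, row):
        # Project the row onto the key attributes.
        # With zero key attributes this projection is always the empty tuple.
        key = tuple(row[name] for name in self.key_attributes)
        if key in self.rows:
            raise ValueError(f"duplicate key {key!r}")
        self.rows[key] = dict(row)


# Zero-attribute key: every row projects to (), so only one row fits.
cell = Relation(key_attributes=())
cell.insert({"length_a": 5.43})
try:
    cell.insert({"length_a": 5.44})
    second_row_accepted = True
except ValueError:
    second_row_accepted = False

# One key attribute: as many rows as there are distinct key values.
atoms = Relation(key_attributes=("label",))
atoms.insert({"label": "Si1", "x": 0.0})
atoms.insert({"label": "Si2", "x": 0.25})
```

The single-row restriction needs no special case in `insert`: it falls out of the same uniqueness rule that governs every other key.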
For DDLm dictionaries, that concept could be applied to give category keys to Set categories without defining any new attributes in those categories. Where Set categories need to be changed, possibly dynamically, into Loop categories, that’s made simpler if the fundamental difference is quantitative (the number of key attributes) rather than qualitative.
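As a rough sketch of why the quantitative view simplifies a Set-to-Loop change, the function below turns a zero-attribute key into a one-attribute key by appending a name to the key tuple. The function name, the `phase_id` attribute, and the sample values are assumptions made for illustration only; this is not how any existing DDLm software is known to work.

```python
def add_key_attribute(key_attributes, rows, name, value):
    """Return a new (key, rows) pair with `name` appended to the key.

    Every existing row is given `value` for the new attribute, so the
    original single row remains valid under the widened key.
    """
    new_key = tuple(key_attributes) + (name,)
    new_rows = [dict(row, **{name: value}) for row in rows]
    return new_key, new_rows


# A Set category modelled as a relation with a zero-attribute key.
set_key = ()
set_rows = [{"length_a": 5.43}]

# Promoting it to a Loop category only lengthens the key by one attribute.
loop_key, loop_rows = add_key_attribute(set_key, set_rows, "phase_id", "A")
```

The same function would extend a one-attribute key to a two-attribute key, which is the sense in which the change is one of degree rather than kind.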
So how about reversing it, and the child categories instead identify their parent categories using a new DDLm attribute?
Of course, we already have exactly this for Loop – Loop relationships, and we will need to use it for Loop – Set_turned_into_Loop relationships, at least conceptually. It takes the form of the _name.linked_item_id of an item with _type.purpose Link. I think it’s sensible to handle Loop – Set relationships analogously. It’s too late to design a single mechanism that could handle both, but that would have been ideal.
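For readers less familiar with how a Link item ties a child category to its parent, the following Python sketch checks that every value of a child's linking item appears among the parent's key values. The helper `unresolved_links` and the sample rows are hypothetical; the sketch illustrates the Loop to Loop case only and makes no claim about how a Loop to Set mechanism would be defined.

```python
def unresolved_links(child_rows, link_name, parent_rows, parent_key_name):
    """Return the child rows whose link value matches no parent key value."""
    parent_values = {row[parent_key_name] for row in parent_rows}
    return [row for row in child_rows if row[link_name] not in parent_values]


# Parent category, keyed on "symbol" (illustrative data).
atom_type = [{"symbol": "Si"}, {"symbol": "O"}]

# Child category: "type_symbol" plays the role of an item with
# _type.purpose Link whose _name.linked_item_id points at the parent key.
atom_site = [
    {"label": "Si1", "type_symbol": "Si"},
    {"label": "O1", "type_symbol": "O"},
    {"label": "X1", "type_symbol": "Xx"},
]

dangling = unresolved_links(atom_site, "type_symbol", atom_type, "symbol")
```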
This would still require extension dictionaries to add information to core categories from time to time. One example might be an imaginary twinning dictionary that introduces 'twin_id' in category 'twin'. Until this dictionary, the 'refln' category implicitly assumed a single value of this identifier, so the dictionary would redefine 'refln' to also depend on 'twin', as would 'diffrn_refln' and some others. The test comes when e.g. the modulated structure dictionary does not know about the existence of the 'twin' dictionary and redefines 'refln' its own way.
This missing information is, however, not a problem as it simply retains the meaning of 'single individual twin' for a modulated structure. If someone wants to describe a twinned modulated structure, then the modulated structures dictionary categories can be updated accordingly, and as long as the 'Base' schema is retained, legacy software will be OK. We still retain the option of explicitly defining the parent/child data names for complex situations.
I acknowledge that we can expect that extensions will still sometimes need to define modifications to core categories. I am ok with that in principle, and although it will require some care in practice, I think it is workable.
On reflection, the original extension dictionary mechanism (adding explicit key data names to child categories) was really just creating these dependencies in child categories, but at the cost of a proliferation of extension dictionaries (e.g. a modulated-structure-twin dictionary, a modulated-structures-laue-twin dictionary etc.). It seems much neater to simply expand the lists of "parent" categories in child categories gradually, within the single dictionary, as the need arises. If we are agreed on this approach, I'll draft a definition for a new DDLm attribute that we can discuss.
As may be evident from my previous comments, I’m not sure I recognize a distinction between an original extension mechanism and a new one. At minimum, I guess I have not been viewing the multi-block proposal as a dictionary extension mechanism, though upon reflection, I see how it has a form of that rolled in. If you are satisfied that we have enough common ground to consider specifics of a DDLm attribute, then I would be happy to have that conversation.