Re: [ddlm-group] Second proposal to allow looping of 'Set' categories
- To: Group finalising DDLm and associated dictionaries <firstname.lastname@example.org>
- Subject: Re: [ddlm-group] Second proposal to allow looping of 'Set' categories
- From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
- Date: Fri, 10 Jun 2016 14:00:35 +0000
- In-Reply-To: <CAM+dB2fTUYtNQNaFMGQFnNyqnAgmU4koexAu-ZsiKm5L+S7qBg@mail.gmail.com>
- References: <CAM+dB2fTUYtNQNaFMGQFnNyqnAgmU4koexAu-ZsiKm5L+S7qBg@mail.gmail.com>
These distinguishing details of James’s new proposal, "Proposal #2", stand out to me (comments interspersed):
() It depends on a new data name, which must be assumed to be well-known to all CIF processors, regardless of which dictionary, if any, actually contains its definition.
() The proposal gives COMCIFS (or our delegate) the responsibility to maintain a controlled vocabulary for values of the new data name.
() The value, if any, associated with the new data name modulates the definitions of other items appearing in the same data block (or save frame?).
() New data names representing category keys and child keys must be created in conjunction with maintaining the vocabulary for the new data name.
At first I thought the key idea there was that CIF data files that make use of loopability of Set categories should affirmatively declare that they are doing so, on a category-by-category basis. Perhaps that was indeed the intent, but CIF data files express that same thing more effectively and less redundantly by simply providing the looped data. Use of an additional item provides no advantage with respect to interpreting data files, and especially not with respect to existing software avoiding misinterpretation of new data files.
I later decided that the primary effect of requiring looped-Set usage to be explicitly declared would be to maintain central control over which Set categories can be presented as multi-packet loops. Leaving aside for the moment the question of whether that’s an appropriate objective, the proposal still assumes that definitions of the relevant parent and child keys will be created, and that provides the same measure of control by itself.
The only other purpose I have come up with for the proposed new item is to support cross validation. That is, given a CIF data file containing a multi-packet loop of items belonging to a Set category, one could consult the new item to confirm that the looped data were presented as such intentionally, with knowledge that the usage of the category is out of the ordinary. I can accept that as a rationale, but I find it pretty weak.
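The cross-validation idea above can be sketched in a few lines. This is only an illustration under stated assumptions: the data block is modelled as a plain dictionary rather than parsed from a real CIF file, the category names are examples, and `declared` stands in for the values of the proposed new data name, whose actual name and vocabulary are not given here. Note that the first function needs nothing but the data to discover which Set categories are looped; the declaration only adds a consistency check on top of that.

```python
from typing import Dict, List, Set

# Assumed model of a data block: category name -> list of packets,
# one dict per loop row. A real CIF parser would supply this.
Block = Dict[str, List[Dict[str, str]]]


def looped_sets(block: Block, set_categories: Set[str]) -> Set[str]:
    """Set categories that the data themselves present as multi-packet loops."""
    return {
        cat
        for cat, packets in block.items()
        if cat in set_categories and len(packets) > 1
    }


def undeclared_looped_sets(
    block: Block, set_categories: Set[str], declared: Set[str]
) -> Set[str]:
    """Looped Set categories missing from the (hypothetical) declaration item."""
    return looped_sets(block, set_categories) - declared


# Illustrative data only: two DIFFRN packets, one CELL packet, two REFLN packets.
block: Block = {
    "diffrn": [{"id": "A"}, {"id": "B"}],
    "cell": [{"length_a": "5.4"}],
    "refln": [{"index_h": "1"}, {"index_h": "2"}],
}
set_categories = {"diffrn", "cell"}  # REFLN is a Loop category, so not listed
```

Calling `undeclared_looped_sets(block, set_categories, set())` flags `diffrn`, while passing `{"diffrn"}` as the declaration yields an empty set. Either way, `looped_sets` already identifies the looped category from the data alone.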
() The proposal retains the distinction between Set and Loop categories, while nevertheless allowing Set categories to be presented as multi-packet loops under some circumstances.
I think I understand why the proposal does this: it maintains a distinction between categories that ordinarily are not looped and those that ordinarily are looped. It also helps support the restrictions on which categories may be presented as multi-packet loops, as discussed above. I am not yet persuaded, however, that this approach should be preferred over simply making most or all categories defined by data dictionaries (as opposed to DDLm itself) be Loops. It also maintains a bias towards ordinary / customary uses of items that may or may not actually be warranted – that’s what got us into this situation in the first place, after all.
() Permission to omit category keys of Set categories is expressed in prose, not machine-readable form.
This would by no means be the only aspect of CIF data definitions whose expression is not machine-readable, but if there were a way to express this aspect in machine-readable form -- and I think there is -- then that would be preferable.
() The proposal has no particular provision for accommodating the implicit relationships between each Set category and every other category.
I’m talking here about the relationships that arise simply by virtue of categories being Sets -- all other items in the same container are at least potentially associated with every Set that appears in the container. These relationships can be expressed in English in the form "The FOO appearing in the same data block". In effect, DDLm Sets are like global variables.
We rely on this all over the place -- for example the REFLNS (Set) and REFLN (Loop) categories rely on the DIFFRN (Set) category to provide the associated experimental details. If DIFFRN were looped, then both of these categories (and potentially many others) would need child keys, too.
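The dependence described above can be made concrete with a small sketch. It assumes packets are modelled as plain dictionaries, and `diffrn_id` is a hypothetical child key name used only for illustration. With a single DIFFRN packet the association is implicit ("the DIFFRN appearing in the same data block"); once DIFFRN is looped, a REFLN packet without a child key can no longer be resolved.

```python
from typing import Dict, List, Optional

Packet = Dict[str, str]


def diffrn_for(refln: Packet, diffrn_packets: List[Packet]) -> Optional[Packet]:
    """Find the DIFFRN packet that a REFLN packet is associated with."""
    if len(diffrn_packets) == 1:
        # Set semantics: the only DIFFRN in the block applies to everything,
        # much like a global variable.
        return diffrn_packets[0]
    # Looped DIFFRN: the association has to be stated by a child key.
    key = refln.get("diffrn_id")  # hypothetical child key name
    if key is None:
        return None  # ambiguous: nothing says which DIFFRN packet applies
    for packet in diffrn_packets:
        if packet.get("id") == key:
            return packet
    return None  # key given, but no matching DIFFRN packet


# Illustrative values only.
single = [{"ambient_temperature": "100"}]
looped = [
    {"id": "A", "ambient_temperature": "100"},
    {"id": "B", "ambient_temperature": "295"},
]
```

Here `diffrn_for({"index_h": "1"}, single)` resolves implicitly, `diffrn_for({"index_h": "1"}, looped)` returns `None`, and only a packet such as `{"index_h": "1", "diffrn_id": "B"}` resolves against the looped data. The same would apply to every other category that currently relies on the implicit relationship.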
Overall, any proposal that requires COMCIFS’s or a DMG’s intervention to enable new usages of existing data names, and that causes such changes to have global scope, as proposal #2 does, destabilizes CIF by increasing the frequency of disruptive changes. I think it would be better to find an alternative that solves the problem once and for all. Adopting such an approach probably would mean relinquishing some of the control that the present proposal would afford us, but I think that’s an essential aspect of the problem space: the more control we exert over what data can be expressed, the more occasions will arise when we need to make changes to allow more or different data expressions.
It will be obvious by this point that I have significant reservations about proposal #2. Lest I seem relentlessly negative, I do have a general idea for an alternative. This e-mail is already more than long enough, however, so I will present that separately.