Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. .. .. .
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. .. .. .
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Thu, 20 Jan 2011 15:10:28 -0500 (EST)
- In-Reply-To: <8F77913624F7524AACD2A92EAF3BFA54166D7D1ED6@SJMEMXMBS11.stjude.sjcrh.local>
- References: <AANLkTikZoEF_D+5-3+Eg4pbCx0cAu+SJvR-a_XkC3zK2@mail.gmail.com><alpine.BSF.2.00.1101190833560.91751@epsilon.pair.com><8F77913624F7524AACD2A92EAF3BFA54166D7D1ECE@SJMEMXMBS11.stjude.sjcrh.local><alpine.BSF.2.00.1101191042290.42382@epsilon.pair.com><4D371BE7.3050501@mcmaster.ca><alpine.BSF.2.00.1101191234221.42382@epsilon.pair.com><8F77913624F7524AACD2A92EAF3BFA54166D7D1ED0@SJMEMXMBS11.stjude.sjcrh.local><alpine.BSF.2.00.1101191632410.65107@epsilon.pair.com><8F77913624F7524AACD2A92EAF3BFA54166D7D1ED1@SJMEMXMBS11.stjude.sjcrh.local><alpine.BSF.2.00.1101191855500.30768@epsilon.pair.com><AANLkTi=xn2ntdNTvdTBKQQTsJhCQFbKcxceJ1C_u1oOf@mail.gmail.com><alpine.BSF.2.00.1101200440460.66943@epsilon.pair.com><8F77913624F7524AACD2A92EAF3BFA54166D7D1ED6@SJMEMXMBS11.stjude.sjcrh.local>
Dear John B.,

The existing alias mechanism has been used in the past to carry actual
information about tags from entry to entry and dictionary to dictionary,
allowing one side or the other of such a definition to be less complete
than it might otherwise have been, but also creating the sort of
conflicts noted. Any software aside, it is certainly confusing to users
of the actual dictionaries to have such conflicting information around.
Just think of this as going back to Codd and trying to make sure we only
have to update our data in one place and not several. It reduces the
chance of error to gather the information about a group of tags with the
same meaning, but different names, in one place.

As David and John W. note, there are important differences in data file
construction rules, dare I say "styles", for DDL1, DDL2 and DDLm
dictionary-based data files. However, once we have a tag-by-tag style
identification so we get the right tags, there is also no reason we
cannot give our validators and writers knowledge of which of the several
over-all styles we are trying to conform to, even the ever troubling
difference in approach as to what can and cannot be looped, and whether
the use of the period for category identification is mandatory (DDL2),
optional (DDLm) or not normally used (DDL1).

David has made a nice case for even more precise detail in the alias
category. Rather than trying to overload the URI or depend on other
external resources, I urge that we provide the details we need in the
alias category itself.

Extending the key is an interesting issue. We may need to add the
implicit concept from DDL2 to DDLm or to make a subcategory.

> I think you slightly mischaracterize the problem here, which is to
> maintain and use two or more related sub-dictionaries within the
> framework of one DDLm dictionary. The fact that one uses DDL1 formalism
> and the other DDL2 formalism is a distinction that could be used in this
> particular case, but not in general.
> There is no reason to suppose that any distinction weaker than
> dictionary identifiers (i.e. _alias.dictionary_uri) would suffice.

Umm, I really don't follow your logic. I agree that DDL1 versus DDL2 is
_not_ a sufficient range of possible distinctions. How does it then
follow that the dictionary_uri will do the job? Especially once we start
acquiring multipurpose dictionaries, the dictionary_uri becomes
impossible to use for this purpose, and very confusing if that uri leads
to other uris. It is a lot simpler to just put the information needed
directly in the alias category.

The rest of what you are saying is that, in addition to using the
dictionary_uri as a real URI, it should also be overloaded as a style,
and that all dictionaries are registered with the IUCr, so no confusion
will result. I just checked the IUCr web page and it does not have the
very critically important PDBx dictionary from wwPDB, and with the DDLm
import mechanism we are likely to end up with a very large number of
cached variants of dictionaries in various states of assembly. Why
burden the IUCr with trying to untangle all that just to avoid putting
the real information we need (or at least I need) in the alias category?

I would suggest we give David the extra alias category tags he is asking
for as well as my tag_style identifier, with reasonable default
assumptions when they are not used. If I fail in what David calls my
"noble" goal in DDLm-ing the imgCIF dictionary and he fails in whatever
use he makes of the extended alias category, what will it have cost
those who choose not to use these features in their dictionaries? If we
succeed, on the other hand, you will have a few extra useful tools you
might decide to use in the future.

Regards,
Herbert

=====================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                 +1-631-244-3035
                 yaya@dowling.edu
=====================================================

On Thu, 20 Jan 2011, Bollinger, John C wrote:

>
> On Thursday, January 20, 2011 4:09 AM, Herbert J. Bernstein wrote:
>
>> If a DDLm dictionary is to be a fully functional replacement for,
>> say, a DDL1 dictionary, a dictionary against which one can validate the
>> use of purely DDL1 tags, we need a way to not only specify the desired
>> DDL1 tag as an alias to the DDLm tag used in the dictionary, but also
>> to specify that we do _not_ want to accept the DDLm tag used as the
>> save frame name as the valid name. As David has noted, in order not to
>> still be maintaining both a DDL1 and a DDLm dictionary, we want this
>> information _in_ the DDLm dictionary, so simply aliasing back to some
>> other DDL1 dictionary to use it as a way to say -- "use that dictionary
>> URI as the style indicator" is suboptimal. Worse, it is a source of
>> future errors and confusion in that it is defining properties of the
>> tag that may end up disagreeing with the properties we wish to actually
>> have that we defined in the DDLm dictionary.
>
>
> We appear to have an important difference of understanding here, which
> requires a clarification of the semantics of the DDLm aliasing. My
> inference from the somewhat terse definitions of these attributes is
> that _alias.dictionary_uri serves primarily as provenance information
> and a namespace identifier. I do not take it that the actual
> definition, if any, in the referenced dictionary is intended to
> contribute anything to the DDLm definition of the item. The DDLm
> definition is self-contained, and it is the responsibility of the author
> of the DDLm dictionary to ensure that his definition is consistent with
> those of the aliased items.
> To the extent that there are any dictionary compatibility problems
> involved, we already have them.
>
>
>> OK, so far, so good -- all we need then is John B.'s tag-by-tag style
>> preference flag to say, for this dictionary we want to be DDL1'ish.
>
>
> I don't think you are interpreting my suggestion as I intended, but I'll
> comment more fully on that elsewhere, if necessary.
>
>
>> Ah, but now we say, we are in the situation of maintaining the core
>> (David's problem) in which we have to maintain a dictionary for
>> validation against both DDL1 and DDL2 tag names. Now there are times
>> when we wish the DDL1 alias to be the preferred alias and for both the
>> DDL2 and DDLm tags to fail a validation check and other times when we
>> wish the DDL2 alias to be the preferred alias and for both the DDL1 and
>> DDLm tags to fail a validation check. Now it becomes simpler to just
>> have a common style key, such as "DDL1" or "DDL2" and to select just
>> the way we do for alternate conformers on that key.
>
>
> I think you slightly mischaracterize the problem here, which is to
> maintain and use two or more related sub-dictionaries within the
> framework of one DDLm dictionary. The fact that one uses DDL1 formalism
> and the other DDL2 formalism is a distinction that could be used in this
> particular case, but not in general. There is no reason to suppose that
> any distinction weaker than dictionary identifiers (i.e.
> _alias.dictionary_uri) would suffice.
>
> Consider, for instance, bringing the symmetry dictionary also into the
> combined, DDLm-form core and mmCIF dictionary. If I want to select only
> symCIF aliases, or only mmCIF aliases for that matter, then how would I
> do it? I could define tag_styles that serve, but at that point those
> tag styles are filling exactly the same role that the dictionary URIs
> could and would.
> Dictionary URIs will always suffice for this particular job, however,
> because they express exactly the distinction that is required.
>
> That's why I asked for use cases that don't map onto distinguishing tags
> based on dictionary or attributes.
>
>
>> OK, that was not so bad, but now we are at, say, the PDB and in
>> addition to having DDL1 and DDL2 style tags from the core, we also have
>> prefixed tags (pdbx) that should eventually get promoted to be
>> prefix-free. Now we can use the styles to validate for strict use of
>> the prefixes when we are producing output that we want to be certain
>> actually does use the prefixes, or relax the validation to allow both
>> the prefixed and promoted tags, or go strict again on the far side to
>> be sure we are only producing promoted tags.
>
>
> So here we are getting into potential use cases such as I had requested,
> but this one by itself doesn't yet persuade me. As I study the topic, I
> suspect I am becoming harder to persuade. I have realized that "group
> of tags" is a fairly good minimal description of "dictionary" in the
> sense that we are (I am) using the term, so I am having more difficulty
> seeing the tag_style proposal as introducing anything new. In this
> particular case, I don't see why pdbx tag aliases should not anyway have
> their own distinguishing dictionary_uri, and if they did, I don't see
> why that dictionary URI would not support all of the proposed operations
> just as well as tag_style might. Even if they didn't, the pdbx alias is
> a characteristic of the tags themselves, so there are multiple
> straightforward ways that an application could perform the PDB-specific
> validation you describe without relying on tag_style or dictionary_uri.
>
>
>> Note that none of these style based input validation choices are based
>> on the choice of dictionary -- it is one dictionary, so it does not
>> really help to be maintaining the styles dictionary by dictionary.
>> The grain of identification is too coarse, and involves multiple
>> maintenance issues when in reality only one, nice new, DDLm dictionary
>> needs to be maintained.
>
>
> I see no maintenance issue here. The DDLm dictionary could indeed be
> the only one maintained, and to the extent that stand-alone versions of
> its sub-dictionaries were desired, they could be generated
> programmatically from the DDLm version -- provided that we retain a
> mechanism for identifying which aliases represent tags in which
> sub-dictionary. The _alias.dictionary_uri attribute does that nicely.
>
> To some extent, this argument seems to revolve around the idea of a
> dictionary URI necessarily referring to a physical, independently
> addressable dictionary. As I said before, I see no reason to place that
> limitation on the item's use. Not restricting it in that way would
> provide considerable freedom, and I propose that we in fact do clarify
> that DDLm places no such restriction.
>
> To the limited extent that there might be any need to retrieve the
> source dictionaries of aliases, the IUCr dictionary register already
> provides a mechanism for doing so. If it were desired to record that
> information directly in DDLm dictionaries, then it would be useful to
> distinguish identifier from location, as XML Schema does (namespace URI
> vs. schema location), and to record the location associated with each
> identifier once per DDLm dictionary rather than at every use of the
> identifier.
>
>
>> On the output side, essentially the same issues arise, but there are
>> fewer users, but as I said, it is a harmless addition to the DDLm spec
>> for those who do not wish to be aware of it, and for those of us for
>> whom it is useful, it really is useful.
>
>
> On the output side, the same congruence between tag_style and
> dictionary_uri still applies.
>
> Users do not need to be aware of the feature to be negatively affected
> by its inclusion.
>
> As for whether it is useful, as far as I am concerned that depends on
> whether an essential and useful difference between tag_style and
> dictionary_uri can be drawn. So far, all the proposed uses that have
> been raised seem natural fits for dictionary_uri.
>
>
>> The fundamental disagreement is on whether we will have to have a DDL1
>> dictionary, a DDL2 dictionary, a DDLm dictionary, a prefix dictionary,
>> etc., and plant them on assorted web sites, or just one DDLm dictionary
>> that handles everything and can be local or remote or in local and
>> remote pieces without changing the behavior of the validation or of the
>> output.
>
>
> No, I think at this point the fundamental disagreement is about the
> meaning and semantics of _alias.dictionary_uri. As I conceive it, use
> of a dictionary URI to group aliases serves every purpose so far
> proposed for tag_style, cleanly and naturally, *and does not require or
> imply independent existence or maintenance of any other dictionary*.
> It is in fact exceedingly similar to my current understanding of
> tag_style, especially when tag_style is paired with a registry of
> allowed values. Thus arises my strengthening objection to adding a new
> attribute that to me appears redundant.
>
> Let's settle the question of _alias.dictionary_uri first. That will be
> worthwhile in its own right, and the results will bear directly on
> whether a new attribute is warranted. Specifically, I propose the
> following DDLm changes:
>
> 1) In the ALIAS category, attribute _category_key.generic is replaced
> by:
>     _category_key.primitive [ '_alias.dictionary_uri' '_alias.definition_id' ]
>
> This is useful to cover cases where the same data name appears in more
> than one dictionary, and we want to mark both appearances as aliases of
> the defined item. I raise it in this context, however, because it
> emphasizes _alias.dictionary_uri's use as a namespace for the alias.
> In conjunction with this change, it might be appropriate to change the
> _type.purpose value for one of these tags.
>
> 2) The definition text for _alias.dictionary_uri is amended to
> "Specifies the universal resource identifier of the abstract or physical
> dictionary containing the definition of an item aliased to the item in
> the current definition. This serves to categorize and fully identify
> the alias, but does not imply that the URI can be used to retrieve a
> physical dictionary defining it."
>
> This clarifies the attribute's meaning in the direction that makes the
> most sense to me. Alternative clarifications are possible, of course.
>
>
> Best Regards,
>
> John
> --
> John C. Bollinger, Ph.D.
> Department of Structural Biology
> St. Jude Children's Research Hospital
>
>
> Email Disclaimer: www.stjude.org/emaildisclaimer
>
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
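
For readers following the thread, the quoted proposal to key the ALIAS category on _alias.dictionary_uri plus _alias.definition_id can be pictured with a small Python sketch. It is only an illustration under stated assumptions: the Alias type, the helper functions, and the urn:example:... identifiers are all hypothetical and are not part of DDLm or of any CIF library. It shows one way a validator might treat the URI purely as a namespace, group aliases into sub-dictionaries by URI, and accept only tags from the chosen group. A tag_style attribute, as proposed in the reply above, would simply be one more field on each alias row, so the sketch does not settle which approach is preferable.

```python
from collections import defaultdict
from typing import NamedTuple


class Alias(NamedTuple):
    """One row of the ALIAS category (hypothetical model, not a real API)."""
    dictionary_uri: str  # used only as a namespace; need not be retrievable
    definition_id: str   # the aliased data name


# Hypothetical DDLm definitions: defined item -> its alias rows.
# The URIs are invented; the data names are illustrative only.
DEFINITIONS = {
    "_cell.length_a": [
        Alias("urn:example:core-ddl1", "_cell_length_a"),
        Alias("urn:example:mmcif-ddl2", "_cell.length_a"),
        # The same data name may appear under a second namespace.
        Alias("urn:example:other", "_cell_length_a"),
    ],
    "_cell.length_b": [
        Alias("urn:example:core-ddl1", "_cell_length_b"),
        Alias("urn:example:mmcif-ddl2", "_cell.length_b"),
    ],
}


def build_alias_index(definitions):
    """Map (dictionary_uri, definition_id) to the defined item.

    The pair acts as the composite key, so a repeated pair is an error
    while a repeated definition_id under a different URI is allowed.
    """
    index = {}
    for item, aliases in definitions.items():
        for alias in aliases:
            if alias in index:
                raise ValueError(f"duplicate alias key: {alias}")
            index[alias] = item
    return index


def tags_by_dictionary(index):
    """Group alias names by dictionary_uri, one 'sub-dictionary' per URI."""
    groups = defaultdict(set)
    for alias in index:
        groups[alias.dictionary_uri].add(alias.definition_id)
    return dict(groups)


def validate(tag, accepted_uri, index):
    """Return the defined item if tag is an alias under accepted_uri, else None."""
    return index.get(Alias(accepted_uri, tag))


index = build_alias_index(DEFINITIONS)
ddl1_item = validate("_cell_length_a", "urn:example:core-ddl1", index)
ddl2_tag_under_ddl1 = validate("_cell.length_a", "urn:example:core-ddl1", index)
```

In this sketch a DDL2-style tag fails validation when only the DDL1 namespace is accepted, which is the strict behaviour discussed in the thread; relaxing the check would amount to accepting more than one URI.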