Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. .. .. .


On Thursday, January 20, 2011 4:09 AM, Herbert J. Bernstein wrote:

>   If a DDLm dictionary is to be a fully functional replacement for, say, a DDL1 dictionary, a dictionary against which one can validate the use of purely DDL1 tags, we need a way to not only specify the desired DDL1 tag as an alias to the DDLm tag used in the dictionary, but also to specify that we do _not_ want to accept the DDLm tag used as the save frame name as the valid name.  As David has noted, in order not to still be maintaining both a DDL1 and a DDLm dictionary, we want this information _in_ the DDLm dictionary, so simply aliasing back to some other DDL1 dictionary to use it as a way to say -- "use that dictionary URI as the style indicator" is suboptimal. Worse, it is a source of future errors and confusion in that it is defining properties of the tag that may end up disagreeing with the properties we wish to actually have that we defined in the DDLm dictionary.


We appear to have an important difference of understanding here, which requires a clarification of the semantics of the DDLm aliasing.  My inference from the somewhat terse definitions of these attributes is that _alias.dictionary_uri serves primarily as provenance information and a namespace identifier.  I do not take it that the actual definition, if any, in the referenced dictionary is intended to contribute anything to the DDLm definition of the item.  The DDLm definition is self-contained, and it is the responsibility of the author of the DDLm dictionary to ensure that his definition is consistent with those of the aliased items.  To the extent that there are any dictionary compatibility problems involved, we already have them.


>OK, so far, so good -- all we need then is John B.'s tag-by-tag style preference flag to say, for this dictionary we want to be DDL1'ish.


I don't think you are interpreting my suggestion as I intended, but I'll comment more fully on that elsewhere, if necessary.


>Ah, but now we say, we are in the situation of maintaining the core (David's problem) in which we have to maintain a dictionary for validation against both DDL1 and DDL2 tag names.  Now there are times when we wish the DDL1 alias to be the preferred alias and for both the DDL2 and DDLm tags to fail a validation check and other times when we wish the DDL2 alias to be the preferred alias and for both the DDL1 and DDLm tags to fail a validation check.
>Now it becomes simpler to just have a common style key, such as "DDL1" or "DDL2" and to select just the way we do for alternate conformers on that key.


I think you slightly mischaracterize the problem here, which is to maintain and use two or more related sub-dictionaries within the framework of one DDLm dictionary.  The fact that one uses DDL1 formalism and the other DDL2 formalism is a distinction that could be used in this particular case, but not in general.  There is no reason to suppose that any distinction weaker than dictionary identifiers (i.e. _alias.dictionary_uri) would suffice.

Consider, for instance, bringing the symmetry dictionary also into the combined, DDLm-form core and mmCIF dictionary.  If I want to select only symCIF aliases, or only mmCIF aliases for that matter, then how would I do it?  I could define tag_styles that serve, but at that point those tag styles are filling exactly the same role that the dictionary URIs could and would.  Dictionary URIs will always suffice for this particular job, however, because they express exactly the distinction that is required.

That's why I asked for use cases that don't map onto distinguishing tags based on dictionary or attributes.
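To make the selection concrete, here is a minimal Python sketch of validating against the aliases of a single dictionary URI. The URIs, the `Alias` class, and the function names are assumptions made for illustration; nothing here is taken from an existing DDLm tool.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Alias:
    dictionary_uri: str  # value of _alias.dictionary_uri
    definition_id: str   # value of _alias.definition_id


# Hypothetical identifiers: they only name groups of aliases and
# need not resolve to a retrievable dictionary file.
CORE_URI = "urn:example:cif_core"
MMCIF_URI = "urn:example:mmcif"

# DDLm definition id -> aliases listed in its definition (illustrative data).
DEFINITIONS = {
    "_cell.length_a": [
        Alias(CORE_URI, "_cell_length_a"),
        Alias(MMCIF_URI, "_cell.length_a"),
    ],
    "_cell.angle_alpha": [
        Alias(CORE_URI, "_cell_angle_alpha"),
        Alias(MMCIF_URI, "_cell.angle_alpha"),
    ],
}


def accepted_tags(definitions, dictionary_uri):
    """Return {alias tag: DDLm definition id} for one dictionary URI."""
    return {
        alias.definition_id.lower(): ddlm_id
        for ddlm_id, aliases in definitions.items()
        for alias in aliases
        if alias.dictionary_uri == dictionary_uri
    }


def unknown_tags(tags, definitions, dictionary_uri):
    """List the tags that are not aliases in the selected dictionary."""
    accepted = accepted_tags(definitions, dictionary_uri)
    return [tag for tag in tags if tag.lower() not in accepted]
```

With the core URI selected, `unknown_tags(["_cell_length_a", "_cell.length_a"], DEFINITIONS, CORE_URI)` flags only the dotted form, which is the behavior requested above for a DDL1-only check.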


>OK, that was not so bad, but now we are at, say, the PDB and in addition to having DDL1 and DDL2 style tags from the core, we also have prefixed tags (pdbx) that should eventually get promoted to be prefix-free.  Now we can use the styles to validate for strict use of the prefixes when we are producing output that we want to be certain actually does use the prefixes, or relax the validation to allow both the prefixed and promoted tags, or go strict again on the far side to be sure we are only producing promoted tags.


So here we are getting into potential use cases such as I had requested, but this one by itself doesn't yet persuade me.  As I study the topic, I suspect I am becoming harder to persuade.  I have realized that "group of tags" is a fairly good minimal description of "dictionary" in the sense that we are (I am) using the term, so I am having more difficulty seeing the tag_style proposal as introducing anything new.  In this particular case, I don't see why pdbx tag aliases should not anyway have their own distinguishing dictionary_uri, and if they did, I don't see why that dictionary URI would not support all of the proposed operations just as well as tag_style might.  Even if they didn't, the pdbx alias is a characteristic of the tags themselves, so there are multiple straightforward ways that an application could perform the PDB-specific validation you describe without relying on tag_style or dictionary_uri.
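As one hedged illustration of a check based on the tags themselves, the following Python sketch tests for a "pdbx_" prefix directly. The three mode names, the `promoted` set, and the example tag names are hypothetical, and the prefix rule is an assumption about how such tags are spelled rather than a statement of PDB policy.

```python
def has_pdbx_prefix(tag):
    """True if the category or item part of a tag starts with 'pdbx_'."""
    category, _, item = tag.lstrip("_").partition(".")
    return category.lower().startswith("pdbx_") or item.lower().startswith("pdbx_")


def violations(tags, mode, promoted):
    """Return the tags that break the chosen policy.

    mode 'prefixed': reject prefix-free forms of tags listed in `promoted`.
    mode 'promoted': reject any tag that still carries the pdbx prefix.
    mode 'either':   accept both forms.
    `promoted` is a hypothetical set of prefix-free tags that were
    formerly prefixed.
    """
    if mode == "prefixed":
        return [tag for tag in tags if tag in promoted]
    if mode == "promoted":
        return [tag for tag in tags if has_pdbx_prefix(tag)]
    if mode == "either":
        return []
    raise ValueError(f"unknown mode: {mode!r}")
```

Neither tag_style nor dictionary_uri is consulted here; the policy is driven entirely by the spelling of the tag plus a list of promoted names.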


>Note that none of these style based input validation choices are based on the choice of dictionary -- it is one dictionary, so it does not really help to be maintaining the styles dictionary by dictionary.  The grain of identification is too coarse, and involves multiple maintenance issues when in reality only one, nice new, DDLm dictionary needs to be maintained.


I see no maintenance issue here.  The DDLm dictionary could indeed be the only one maintained, and to the extent that stand-alone versions of its sub-dictionaries were desired, they could be generated programmatically from the DDLm version -- provided that we retain a mechanism for identifying which aliases represent tags in which sub-dictionary.  The _alias.dictionary_uri attribute does that nicely.
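A rough Python sketch of that programmatic step, under the assumption that the alias information has already been read out of the DDLm dictionary as simple rows. The row layout, the URIs, and the function name are illustrative, and writing out actual DDL1 or DDL2 text is left aside.

```python
from collections import defaultdict

# (DDLm definition id, _alias.dictionary_uri, _alias.definition_id)
# Illustrative rows with hypothetical URIs.
ALIAS_ROWS = [
    ("_cell.length_a", "urn:example:cif_core", "_cell_length_a"),
    ("_cell.length_a", "urn:example:mmcif", "_cell.length_a"),
    ("_cell.angle_alpha", "urn:example:cif_core", "_cell_angle_alpha"),
]


def split_by_dictionary(rows):
    """Group alias tags by dictionary URI: {uri: {alias tag: DDLm id}}."""
    subdicts = defaultdict(dict)
    for ddlm_id, uri, alias_tag in rows:
        subdicts[uri][alias_tag] = ddlm_id
    return dict(subdicts)
```

Each group returned by `split_by_dictionary(ALIAS_ROWS)` is the tag list from which one stand-alone sub-dictionary could be generated.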

To some extent, this argument seems to revolve around the idea of a dictionary URI necessarily referring to a physical, independently addressable dictionary.  As I said before, I see no reason to place that limitation on the item's use.  Not restricting it in that way would provide considerable freedom, and I propose that we in fact do clarify that DDLm places no such restriction.

To the limited extent that there might be any need to retrieve the source dictionaries of aliases, the IUCr dictionary register already provides a mechanism for doing so.  If it were desired to record that information directly in DDLm dictionaries, then it would be useful to distinguish identifier from location, as XML Schema does (namespace URI vs. schema location), and to record the location associated with each identifier once per DDLm dictionary rather than at every use of the identifier.
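The identifier/location split could look something like the following Python sketch, in which each identifier is mapped to a location once per dictionary and individual aliases carry only the identifier. The table, the URIs, and the example.org address are all hypothetical.

```python
# Recorded once per DDLm dictionary: identifier -> retrievable location.
# None marks an abstract grouping with nothing to fetch.
DICTIONARY_LOCATIONS = {
    "urn:example:cif_core": "https://example.org/dictionaries/cif_core.dic",
    "urn:example:pdbx_local": None,
}


def location_of(dictionary_uri, locations=DICTIONARY_LOCATIONS):
    """Return the recorded location for an identifier, or None if there is none."""
    return locations.get(dictionary_uri)
```

An alias whose identifier has no location is still fully identified; it simply has no physical dictionary behind it.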


>On the output side, essentially the same issues arise, but there are fewer users, but as I said, it is a harmless addition to the DDLm spec for those who do not wish to be aware of it, and for those of us for whom it is useful, it really is useful.


On the output side, the same congruence between tag_style and dictionary_uri still applies.

Users do not need to be aware of the feature to be negatively affected by its inclusion.

As for whether it is useful, as far as I am concerned that depends on whether an essential and useful difference between tag_style and dictionary_uri can be drawn.  So far, all the proposed uses that have been raised seem natural fits for dictionary_uri.


> The fundamental disagreement is on whether we will have to have a DDL1 dictionary, a DDL2 dictionary, a DDLm dictionary, a prefix dictionary, etc., and plant them on assorted web sites, or just one DDLm dictionary that handles everything and can be local or remote or in local and remote pieces without changing the behavior of the validation or of the output.


No, I think at this point the fundamental disagreement is about the meaning and semantics of _alias.dictionary_uri.  As I conceive it, use of a dictionary URI to group aliases serves every purpose so far proposed for tag_style, cleanly and naturally, *and does not require or imply independent existence or maintenance of any other dictionary*.  It is in fact exceedingly similar to my current understanding of tag_style, especially when tag_style is paired with a registry of allowed values.  Thus arises my strengthening objection to adding a new attribute that to me appears redundant.

Let's settle the question of _alias.dictionary_uri first.  That will be worthwhile in its own right, and the results will bear directly on whether a new attribute is warranted.  Specifically, I propose the following DDLm changes:

1) In the ALIAS category, attribute _category_key.generic is replaced by:
    _category_key.primitive [ '_alias.dictionary_uri' '_alias.definition_id' ]

This is useful to cover cases where the same data name appears in more than one dictionary, and we want to mark both appearances as aliases of the defined item.  I raise it in this context, however, because it emphasizes _alias.dictionary_uri's use as a namespace for the alias.  In conjunction with this change, it might be appropriate to change the _type.purpose value for one of these tags.

2) The definition text for _alias.dictionary_uri is amended to "Specifies the uniform resource identifier of the abstract or physical dictionary containing the definition of an item aliased to the item in the current definition.  This serves to categorize and fully identify the alias, but does not imply that the URI can be used to retrieve a physical dictionary defining it."

This clarifies the attribute's meaning in the direction that makes the most sense to me.  Alternative clarifications are possible, of course.
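To illustrate the effect of change (1), here is a small Python sketch that checks alias rows for key collisions. The rows and URIs are hypothetical, and the single-column key on the data name is assumed purely for contrast with the proposed two-part key.

```python
# Illustrative rows: (_alias.dictionary_uri, _alias.definition_id).
# The same data name appears in two hypothetical dictionaries.
ALIASES = [
    ("urn:example:cif_core", "_example_tag"),
    ("urn:example:other_dict", "_example_tag"),
]


def find_key_collisions(rows, key_columns):
    """Return the key values that occur in more than one row."""
    seen = set()
    collisions = []
    for row in rows:
        key = tuple(row[i] for i in key_columns)
        if key in seen:
            collisions.append(key)
        seen.add(key)
    return collisions
```

Keyed on the data name alone, `find_key_collisions(ALIASES, (1,))` reports a collision; keyed on the (dictionary_uri, definition_id) pair, `find_key_collisions(ALIASES, (0, 1))` reports none, so both appearances can be recorded as aliases of the same item.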


Best Regards,

John
--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital

