Re: [ddlm-group] DDLm aliases (subject changed)

On Saturday, January 22, 2011 7:19 PM, James Hester wrote:

>Nevertheless, an important use case for rewriting tags has been
>identified by Herbert: transitioning from the use of tags with a local
>identifier to those using a "global" (ie no namespace) identifier.


Indeed, local identifier prefixes map very neatly onto the concept for which I propose _alias.xref_code.  Both correlate well with XML qualified names.


>With something like the tag_style proposal in place, the DDLm
>dictionary writer can write the dictionary as if it were a global
>dictionary (this may particularly help with dREL methods) and include
>a "local" tag_style which gives an alternate dataname that includes
>the local section. In tandem with this, any datafiles containing
>datanames defined in this local dictionary would use the audit
>category to specify both a dictionary *and* a style.


Other than the specific categories used to convey data name associations, I'm having trouble seeing how this would not be covered by my own proposal, especially if aliases are allowed to refer to definitions in the same dictionary rather than to external definitions only.


>  If only "local"
>datanames are in use, then the style would be "local"; if the
>dictionary becomes a standard, no rewriting is necessary, and
>datafiles can now just use the default value of style ("standard").  I
>think this is a compelling use case, but still have to think through
>how dictionary merging will work.


And like other uses proposed so far for some kind of tag style, that's equivalent to defining and using a virtual sub-dictionary -- in this case, one comprising "local" tags, or perhaps "local" tags plus others.  This is a good example of a use for categorizing tags based on something unrelated to DDL conformance, however.  In fact, it's a good example of a case where DDL conformance isn't even relevant to the question.


>The second future use case is that of datanames in a DDLm dictionary
>containing non-ASCII code points.  These, and only these, DDLm
>datanames are not CIF1-compatible.  A style could therefore be added
>giving the "ASCII" equivalent dataname.


Sure, but to what use might that information be put that makes it conceptually different from an alias?  How would the collection of such CIF1-compatible names and associated definitions not be well described as a (sub-) dictionary?

And perhaps this is too picky, but why would anyone bother to define ASCII alternative names when they could simply stick to ASCII names in the first place?  That's not to say I don't see non-ASCII data names ever being defined, but rather that it will not be useful or appropriate to define such until the dictionary author decides it is safe to assume that the target audience can consume them as-is.


>As John W was suggesting (at least reading between the lines), the
>above two use cases are semantically distinct from aliases.  Aliases
>point to definitions in a dictionary and state that the aliased
>dataname is the equivalent dataname in a different dictionary.  As the
>dictionary DDL languages may be different, there are no explicit
>guarantees that all semantic properties (e.g. category relationships)
>can be preserved in making this translation.  On the other hand, the
>tag_style use is a simple rewriting of the dataname preserving perfect
>semantic identity.


Well, that's entirely up to COMCIFS, isn't it?  And by extension, to us?  Moreover, it's a distinction that cannot be drawn on a per-item basis, relying as it does on whether relationships are modeled equivalently across multiple items.  It could be drawn at the dictionary_xref level or the equivalent.  Even at that level, however, it's not clear to me that the DDLm relational model needs to be able to express the difference.  After all, the user of the dictionary needs to know something about the alias/whatever sets to determine how to use each.

If DDLm does make the distinction, it's not clear to me that the best approach would be to create a whole separate category, as opposed to adding one attribute to the existing category, or even to relying on attributes that are already present.


There is one distinction that I think might be more appropriate, and perhaps that's where this is going: whether an alternative name is intended to be defined in the current (DDLm) dictionary, or whether it is a reference into a logically or physically separate dictionary.  That might especially be the case if we want to provide for coalescing the definitions of equivalent items.  For example, if a data name has been deprecated and replaced with a preferred, equivalent alternative, then do we need or want dictionary writers to provide two distinct definitions?  If they did provide distinct definitions, is there a way to define the appropriate relationship between them, as there is in DDL1 and DDL2?
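
To make the coalescing idea concrete, here is a minimal sketch in Python of one definition being shared by a preferred name and a deprecated one.  It is only an illustration under stated assumptions: the `Definition` class, the `build_index` helper and the data names are all hypothetical, and nothing here is taken from DDLm itself.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterable, List


@dataclass
class Definition:
    """One dictionary definition (hypothetical structure, not a DDLm API)."""
    name: str                                          # preferred data name
    description: str
    aliases: List[str] = field(default_factory=list)  # e.g. deprecated names


def build_index(definitions: Iterable[Definition]) -> Dict[str, Definition]:
    """Map every preferred name and every alias to its single shared definition."""
    index: Dict[str, Definition] = {}
    for definition in definitions:
        for name in [definition.name, *definition.aliases]:
            key = name.lower()  # CIF data names are matched case-insensitively
            if key in index:
                raise ValueError(f"data name defined twice: {name}")
            index[key] = definition
    return index


# Illustrative, made-up data names.
index = build_index([
    Definition("_example.new_name", "Preferred item.",
               aliases=["_example_old_name"]),
])
```

With this layout, looking up the deprecated name returns the same object as looking up the preferred one, so the dictionary writer maintains one definition rather than two.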

I think that's a different case from local and ASCII.  For those latter use cases to make sense, I think they have to provide *exclusive* alternatives to their associated data names, and they would best be used on a per-CIF, all-or-nothing basis.  It isn't reasonable to allow both forms of a given name to be used in the same CIF, at least in the same data block or save frame.  Over the whole CIF, it defeats the purpose to use, say, the ASCII version of some data names but the non-ASCII version of others.  To me, then, these particular cases need to establish alternative logical dictionaries.
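
The all-or-nothing rule could be checked mechanically.  The following Python sketch assumes a hypothetical table that records, for each item, the name it takes under each presentation; the table layout, the labels "standard" and "local", the `_xyz_` prefix and the function name are all invented for illustration.

```python
from typing import Dict, Iterable, List

# Hypothetical table: canonical name -> {presentation label: name as written}.
NAME_TABLE: Dict[str, Dict[str, str]] = {
    "_cell.length_a": {"standard": "_cell.length_a",
                       "local": "_xyz_cell.length_a"},
    "_cell.length_b": {"standard": "_cell.length_b",
                       "local": "_xyz_cell.length_b"},
}


def consistent_presentations(block_names: Iterable[str],
                             table: Dict[str, Dict[str, str]]) -> List[str]:
    """Return the presentations under which *every* name in the block is known.

    An empty result means the block mixes presentations or uses unknown names.
    """
    used = {name.lower() for name in block_names}
    labels = sorted({label for forms in table.values() for label in forms})
    result = []
    for label in labels:
        # All names this presentation (logical dictionary) defines.
        known = {forms[label].lower()
                 for forms in table.values() if label in forms}
        if used <= known:
            result.append(label)
    return result
```

Each presentation is treated here as its own logical dictionary: a data block is acceptable only if at least one such dictionary covers all of its names.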



>Therefore, I believe that the tag_style tag should not be conflated
>with aliases, but should be created in a separate category.  Note also
>that "local" and "ASCII" are not mutually exclusive designations, so
>some further work is necessary to get everything to work together
>properly (e.g. how do I transition between "local + ASCII", "local",
>"ASCII" and "standard+ASCII"?).  I also think that "style" is probably
>not the best terminology to use - perhaps "presentation" or "view"
>would be better.


It's a shame that "alias" is already taken :^)

Really, though, if you take away the possibility of DDLm aliases serving as "presentation" mappings, then "cross reference" is a better description of them than "alias" is.  What James describes as different tag styles is exactly what I'd be inclined to describe as aliasing.


Wacky thought: is it possible to put a method at dictionary level?  For example, one that converts data names to or from different presentations?  Could such a feature be used to generate "presentation" mappings or otherwise to modify the dictionary itself dynamically?  That wouldn't necessarily work generally for embedding dictionaries, but it might work for local vs. standard +- ASCII.
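
As a rough illustration of what such a dictionary-level method might compute, here is a sketch in Python rather than dREL.  It covers only the local vs. standard case and assumes a purely hypothetical convention in which the local form inserts a prefix after the leading underscore; the function names and the `xyz` prefix are invented.

```python
from typing import Dict, Iterable


def to_local(name: str, prefix: str) -> str:
    """Rewrite a standard data name into its (assumed) local form."""
    if not name.startswith("_"):
        raise ValueError(f"not a data name: {name!r}")
    return f"_{prefix}_{name[1:]}"


def to_standard(name: str, prefix: str) -> str:
    """Invert to_local(); reject names that lack the local prefix."""
    lead = f"_{prefix}_"
    if not name.startswith(lead):
        raise ValueError(f"{name!r} does not carry local prefix {prefix!r}")
    return "_" + name[len(lead):]


def presentation_map(names: Iterable[str], prefix: str) -> Dict[str, str]:
    """Generate the standard -> local mapping for a whole dictionary."""
    return {name: to_local(name, prefix) for name in names}


mapping = presentation_map(["_cell.length_a", "_cell.length_b"], "xyz")
```

Because the rewriting is a rule rather than a per-item table, the mapping can be regenerated on demand instead of being stored in every definition.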


John
--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital



