Discussion List Archives

Re: [ddlm-group] DDLm aliases (subject changed)

On Saturday, January 22, 2011 7:19 PM, James Hester wrote:

>I believe this discussion arose out of a misconception, but will end
>up producing something useful.  First of all, we should be clear that
>the only well-defined meaning of "DDL1/DDL2/DDLm tag" is "a dataname
>defined in a dictionary written using DDL1/DDL2/DDLm".  Note in
>particular that all DDL1 and DDL2 tags are consistent with CIF1
>syntax, and writing a DDLm dictionary with CIF1-compatible tags is
>also not troublesome.


I take that to mean something like "any CIF1-compatible tag can be defined in a DDLm dictionary," which is true.  Of course, tags that do not comply with dREL conventions cannot be referenced by methods, at least not directly.  Depending on the semantics of ALIAS (or some similar mechanism for associating data names), however, we could make it possible to reference them indirectly through a dREL-compatible name.  It might work something like this:

====
save_diffrn_standards.decay_percent
    _definition.id             '_diffrn_standards.decay_percent'

[...]

    loop_
        _alias.xref_code
        _alias.definition_id
        _alias.dictionary_version
        _alias.deprecated
        .  '_diffrn_standards_decay_%' . yes

save_
====

The key there is use of an _alias.xref_code value that signifies a data name logically defined in the same dictionary, rather than in an external one.  (The example uses . for that purpose, but there are alternatives.)  The result would be that _diffrn_standards.decay_percent and _diffrn_standards_decay_% are aliased in roughly the same sense as identifiers in a programming language can be.

We could go even further by saying that an alias into the current dictionary does not require (perhaps should not have) a separate definition from that of the item to which it is aliased.  That way opportunities for inconsistency within a single dictionary could be foreclosed.
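To make the intended semantics concrete, here is a minimal sketch of how software might resolve such an alias. It is only an illustration under stated assumptions: the in-memory layout (`DEFINITIONS`), the helper names `build_alias_index` and `resolve`, and the reading of an `_alias.xref_code` of `.` as "same dictionary" are all hypothetical, taken from the example above rather than from any existing DDLm tooling. Names are compared case-insensitively, as CIF data names are.

```python
# Hypothetical in-memory model: defining data name -> list of alias records.
# An xref_code of "." is assumed to mean "alias within this same dictionary".
DEFINITIONS = {
    "_diffrn_standards.decay_percent": [
        {
            "xref_code": ".",
            "definition_id": "_diffrn_standards_decay_%",
            "deprecated": "yes",
        },
    ],
}


def build_alias_index(definitions):
    """Map each defined name and each same-dictionary alias to the defining name."""
    index = {}
    for defined_name, aliases in definitions.items():
        index[defined_name.lower()] = defined_name
        for alias in aliases:
            # Only aliases flagged as internal are folded into this index.
            if alias["xref_code"] == ".":
                index[alias["definition_id"].lower()] = defined_name
    return index


def resolve(name, index):
    """Return the defining data name for `name`, or None if it is unknown."""
    return index.get(name.lower())


index = build_alias_index(DEFINITIONS)
print(resolve("_diffrn_standards_decay_%", index))
```

With an index like this, a dREL method would only ever need to mention the dREL-compatible defining name, while data files could carry either form.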


>  This means that it is simple to write a DDLm
>imgCIF or coreCIF dictionary where datanames satisfy CIF1 syntax
>rules.  A datafile in CIF1 syntax can then refer to the DDLm
>dictionary as the reference for the datanames.
>
>On the other hand, it is not possible to write a DDLm dictionary that
>can serve as a DDL1 or DDL2 dictionary, because the DDL languages are
>different and incompatible.
>
>  Simply rewriting the tags does not change
>the fact that the tag is defined in a DDLm dictionary and therefore is
>interpretable using DDLm semantics *only*. The concept of a virtual
>dictionary generated from a "master" DDLm dictionary but with
>DDLm/DDL1/DDL2 flavours is therefore meaningless and should be
>abandoned.


I don't yet see what validity constraints can be expressed using DDL1 or DDL2 that cannot be expressed using DDLm, or what validity constraints are inherently implied by DDLm definitions that are not implied by DDL1 or DDL2 definitions.  Absent distinctions of these kinds, it's not clear to me why an existing DDL1 or DDL2 dictionary could not be rewritten in DDLm format.  Indeed, I take it as an important objective for DDLm that there be no obstacle to such rewriting.

What else does it mean, then, that a DDLm dictionary cannot serve as a DDL1 or DDL2 dictionary?

Suppose that we perform such a rewrite from (say) a DDL1 dictionary, D1, to a DDLm dictionary, Dm.  As part of this process we record for each defined name an alias to the corresponding D1 definition, even though the two data names are the same.  By assumption, Dm is equivalent to D1, at least with respect to the CIFs it validates.

Now suppose that we create a new dictionary, Dm', based on all the aliases defined in Dm.  For each alias we write a definition for the aliased name by copying the Dm definition in which the alias appears, except substituting the alias data name for the defined data name.  Because all the aliased data names are the same as the associated defined names, however, Dm' is identical to Dm.  Thus it, too, is equivalent to D1 for validation purposes.

But what if we create a third DDLm dictionary, Dm'', by adding some definitions that have no aliases into D1, by changing some of Dm's defined data names without otherwise altering their definitions, and by adding aliases into some other dictionary to some of the defined items?  Dm'' is *not* equivalent to D1, at least because it will accept some data names that D1 does not, and it will reject some data names that D1 accepts.  But if we apply the same procedure described in the previous paragraph to generate a new dictionary from the aliases defined in Dm'', selecting only those aliasing into D1 and stripping aliases to other dictionaries from the result, then we regenerate Dm', which is equivalent to D1 for validation purposes.

Wouldn't it then be reasonable to describe D1 as a virtual DDL1 dictionary resident in or generated from Dm''?
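The extraction procedure described above can be sketched in a few lines. This is an illustration only: the dictionary layout, the `type` attribute standing in for "the rest of the definition", the xref codes `D1` and `OTHER`, the data names, and the function name `extract_virtual_dictionary` are all hypothetical.

```python
def extract_virtual_dictionary(host, target_xref):
    """Build a dictionary keyed by the names that `host` aliases into `target_xref`.

    Each generated definition is a copy of the host definition in which the
    alias appears, with aliases to any other dictionary stripped.
    """
    virtual = {}
    for definition in host.values():
        kept = [a for a in definition["aliases"] if a["xref_code"] == target_xref]
        for alias in kept:
            copy = {k: v for k, v in definition.items() if k != "aliases"}
            copy["aliases"] = kept
            # The alias name replaces the host's defined name as the key.
            virtual[alias["definition_id"]] = copy
    return virtual


# Hypothetical Dm'': one renamed item aliased into D1 and elsewhere,
# plus one new item that has no alias into D1.
DM_DOUBLE_PRIME = {
    "_cell.length_a": {
        "type": "Real",
        "aliases": [
            {"xref_code": "D1", "definition_id": "_cell_length_a"},
            {"xref_code": "OTHER", "definition_id": "_other_cell_a"},
        ],
    },
    "_new.item": {"type": "Text", "aliases": []},
}

dm_prime = extract_virtual_dictionary(DM_DOUBLE_PRIME, "D1")
print(sorted(dm_prime))
```

In this toy case the result contains only the D1 name, carrying the host's definition and a single alias back to itself, which is the shape of Dm' in the argument above.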


Now consider a CIF originally written and valid against the DDL2 mmCIF dictionary, and a DDLm dictionary containing mmCIF as a virtual dictionary.  We can use dREL methods to generate additional values, and we can use the defined aliases to map those values to their mmCIF data names, if they have any.  We can test whether those data that have corresponding mmCIF names are collectively valid against the virtual mmCIF dictionary, we can determine what other data may be needed for validity, and we can look for dREL methods in the host dictionary to generate any such.  After all data manipulations are complete, we can output the data using the mmCIF data names, omitting any data not defined by mmCIF, and knowing (if we wish) whether the resulting data set is valid against mmCIF.  These are useful things to do, and they require little that DDLm does not already provide.
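The final output step might look something like the following sketch, which renames data items to their names in the virtual dictionary and omits anything the virtual dictionary does not define. Everything here is an assumption for illustration: the host data names, the xref code `mmCIF`, the sample values, and the function name `export_with_aliases` are hypothetical, and no validation or dREL evaluation is attempted.

```python
def export_with_aliases(data, host, target_xref):
    """Rename data items to their names in `target_xref`, dropping unmapped items."""
    name_map = {}
    for defined_name, definition in host.items():
        for alias in definition["aliases"]:
            if alias["xref_code"] == target_xref:
                # Use the first alias into the target dictionary.
                name_map[defined_name] = alias["definition_id"]
                break
    return {name_map[name]: value for name, value in data.items() if name in name_map}


# Hypothetical host dictionary; only the alias records matter for this step.
HOST = {
    "_cell.vol": {
        "aliases": [{"xref_code": "mmCIF", "definition_id": "_cell.volume"}],
    },
    "_host_only.item": {"aliases": []},
}

# Hypothetical data, e.g. after dREL methods have filled in derived values.
DATA = {"_cell.vol": "1234.5", "_host_only.item": "x"}

exported = export_with_aliases(DATA, HOST, "mmCIF")
print(exported)
```

Checking the exported items against the virtual dictionary would be a separate pass over the same alias records.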


I do not argue that converting a CIF that is valid against a host DDLm dictionary to a CIF valid against some virtual dictionary resident in the host would be as simple as merely translating data names.  If the host properly embeds the virtual dictionary as described above, however, then it's not obvious to me that it would not be.  I shall have to think on that.


It is a different question whether a single DDLm dictionary can simultaneously provide the semantics of both a DDL1 dictionary and an overlapping DDL2 dictionary, as Herbert proposes.  David also expressed doubts about the feasibility of that proposition.  Surely, however, the prospects at least depend on the contents of the DDL1 and DDL2 dictionaries that are proposed to be co-expressed this way.  I can easily produce a trivial example that works, and Herbert is by far the best judge of how well imgCIF is suited to such treatment.


This is already quite long, so I will respond separately to the other part of James's comments.


Regards,

John

--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital


_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
