
Re: [ddlm-group] DDLm aliases (subject changed)

Dear Colleagues,

   I can find almost nothing in James's side comments about DDLm versus
DDL2 versus DDL1 with which I agree.  I really do believe that
I will be able to write a single DDLm dictionary and supporting
software that will be able to validate current imgCIF files against
DDL2 rules, a CIF2 imgCIF file against DDLm rules, and, hopefully,
a new DDL1 imgCIF file against DDL1 rules.  Thus far I have not
found anything in DDLm that will prevent a DDLm dictionary with
a few added tags (such as something of the sort David and I have
proposed) from carrying all the information needed for this purpose,
but I am more familiar with DDL2 than with DDL1.  Perhaps James
is referring to some inherent conflict between DDLm and DDL1,
but I really thought the intent of DDLm was to allow it to become
the dictionary definition language for dictionaries that could
be used to validate _all_ archived DDL1 and DDL2 data files.

   Certainly, in using that dictionary for DDL2 validation, I will
be using it in a way that is not consistent with all DDLm rules, e.g.
by not honoring the list restriction, but that is precisely the point.
Most of what is in DDLm looks very similar to what is already in
DDL2, so that specialization looks to me to be very doable, and DDLm
seems to support the major concepts, such as the list flag
from DDL1.

   So right now, I have no idea what misconception James is referring to,
but does it really matter?  I will either succeed, or I will fail,
but the additions I have asked for will not do any harm to those
who do not use them.

   I can cope either with David's and John's approach of
putting the grouping information in with the alias information,
or with James's wish to go back to my original proposal to
have the group information separated out.  I just need a
way to organize the grouping relationship among stylistically
different tags with the same meanings.   If you all prefer,
I can just work up an imgCIF DDLm dictionary using
a modified DDLm with a few added tags using the prefix mechanism.
That might help in understanding the issues I am looking at
without trespassing on what is or is not formally part of DDLm.
For the moment I would put the mapping information into
a separate category, but make it joinable to alias, so it
can be used either the way John B. has proposed or separately
as James has proposed.  The only point that seems to be a
real sticking point is whether the grouping identifier should
be conflated with the xref code or not.  James seems to feel
very strongly that it should be distinct.  John B. argued very
strongly that the grouping identifier had to be conflated with
the xref code.  I think it is clearer to make it distinct, but
that, other than being a little confusing, John B.'s approach
does no real harm.
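
   To make the two options concrete, here is a minimal sketch in
Python (not DDLm) of the difference between keeping the grouping
identifier in a separate category that is joinable to alias, and
conflating it with the xref code.  The alias rows reuse the tags
from John B.'s example quoted below; the group identifiers
("ddl1_style" and so on), the row layout, and the function names
are assumptions made up for illustration and are not part of any
proposal.

```python
from collections import defaultdict

# Alias rows as (definition_id, xref_code), taken from John B.'s example.
ALIAS_ROWS = [
    ("_diffrn_standards_decay_%", "core"),
    ("_diffrn_standards.decay_%", "mmcif"),
]

# Hypothetical separate grouping category: (definition_id, group_id).
# The group ids are invented for this sketch.
GROUP_ROWS = [
    ("_diffrn_standards_decay_%", "ddl1_style"),
    ("_diffrn_standards.decay_%", "ddl2_style"),
    ("_diffrn_standards.decay_percent", "ddlm_style"),
]


def tags_by_group(group_rows):
    """Distinct grouping identifier: collect tags under each group id."""
    groups = defaultdict(list)
    for tag, group_id in group_rows:
        groups[group_id].append(tag)
    return dict(groups)


def tags_by_xref(alias_rows):
    """Conflated approach: the xref code itself acts as the group id."""
    groups = defaultdict(list)
    for tag, xref_code in alias_rows:
        groups[xref_code].append(tag)
    return dict(groups)


def join_alias_with_groups(alias_rows, group_rows):
    """Join the two categories on definition_id."""
    groups_for_tag = defaultdict(list)
    for tag, group_id in group_rows:
        groups_for_tag[tag].append(group_id)
    return [
        (tag, xref_code, group_id)
        for tag, xref_code in alias_rows
        for group_id in groups_for_tag.get(tag, [])
    ]
```

   In this sketch the separate category can also group the primary
DDLm tag, which has no alias row of its own, while the join still
recovers the combined view that the conflated approach provides.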

   What do other people think about conflating the grouping
identifier with the xref code?  I'll go along with whatever
the majority chooses on that issue.

   Regards,
     Herbert


At 12:19 PM +1100 1/23/11, James Hester wrote:
>I believe this discussion arose out of a misconception, but will end
>up producing something useful.  First of all, we should be clear that
>the only well-defined meaning of "DDL1/DDL2/DDLm tag" is "a dataname
>defined in a dictionary written using DDL1/DDL2/DDLm".  Note in
>particular that all DDL1 and DDL2 tags are consistent with CIF1
>syntax, and writing a DDLm dictionary with CIF1-compatible tags is
>also not troublesome.  This means that it is simple to write a DDLm
>imgCIF or coreCIF dictionary where datanames satisfy CIF1 syntax
>rules.  A datafile in CIF1 syntax can then refer to the DDLm
>dictionary as the reference for the datanames.
>
>On the other hand, it is not possible to write a DDLm dictionary that
>can serve as a DDL1 or DDL2 dictionary, because the DDL languages are
>different and incompatible.  Simply rewriting the tags does not change
>the fact that the tag is defined in a DDLm dictionary and therefore is
>interpretable using DDLm semantics *only*. The concept of a virtual
>dictionary generated from a "master" DDLm dictionary but with
>DDLm/DDL1/DDL2 flavours is therefore meaningless and should be
>abandoned.
>
>Nevertheless, an important use case for rewriting tags has been
>identified by Herbert: transitioning from the use of tags with a local
>identifier to those using a "global" (i.e. no namespace) identifier.
>With something like the tag_style proposal in place, the DDLm
>dictionary writer can write the dictionary as if it were a global
>dictionary (this may particularly help with dREL methods) and include
>a "local" tag_style which gives an alternate dataname that includes
>the local section. In tandem with this, any datafiles containing
>datanames defined in this local dictionary would use the audit
>category to specify both a dictionary *and* a style.  If only "local"
>datanames are in use, then the style would be "local"; if the
>dictionary becomes a standard, no rewriting is necessary, and
>datafiles can now just use the default value of style ("standard").  I
>think this is a compelling use case, but still have to think through
>how dictionary merging will work.
>
>The second future use case is that of datanames in a DDLm dictionary
>containing non-ASCII code points.  These, and only these, DDLm
>datanames are not CIF1-compatible.  A style could therefore be added
>giving the "ASCII" equivalent dataname.
>
>As John W was suggesting (at least reading between the lines), the
>above two use cases are semantically distinct from aliases.  Aliases
>point to definitions in a dictionary and state that the aliased
>dataname is the equivalent dataname in a different dictionary.  As the
>dictionary DDL languages may be different, there are no explicit
>guarantees that all semantic properties (e.g. category relationships)
>can be preserved in making this translation.  On the other hand, the
>tag_style use is a simple rewriting of the dataname preserving perfect
>semantic identity.
>
>Therefore, I believe that the tag_style tag should not be conflated
>with aliases, but should be created in a separate category.  Note also
>that "local" and "ASCII" are not mutually exclusive designations, so
>some further work is necessary to get everything to work together
>properly (e.g. how do I transition between "local + ASCII", "local",
>"ASCII" and "standard+ASCII"?).  I also think that "style" is probably
>not the best terminology to use - perhaps "presentation" or "view"
>would be better.
>
>I have so far no objection in principle to normalising out the
>dictionary using dictionary_xref as John has proposed.
>
>On Sat, Jan 22, 2011 at 7:47 AM, Herbert J. Bernstein
><yaya@bernstein-plus-sons.com> wrote:
>>  This can be made to work, but for my uses, there are
>>  some minor issues:
>>
>>  1.  I will be grouping the primary DDLm tag.  With the
>>  _definition.xref_code removed, the primary DDLm tag
>>  will have to be aliased; and
>>
>>  2.  With multiple xref codes for a given tag (e.g.
>>  DDL2 and DDLm), it would be more appropriate to
>>  normalize and put the tags and xref codes into
>>  a sub-category, rather than to keep repeating the
>>  same tag.  This would have the advantage of allowing
>>  the alias category to return to a non-compound key
>>  and would also allow all the grouping of
>>  tags in a dictionary to be gathered on a separate
>>  block, if desired.
>>
>>  For these reasons, I suggest
>>
>>  1.  Leave _alias.dictionary_uri, but deprecate it in
>>  favor of:
>>
>>  2.  Create an ALIAS_XREF category with the
>>  following tags, forming a composite key
>>
>>  _alias_xref.definition_id
>>      a tag identifier belonging to a group
>>  _alias_xref.xref_code
>>      a code identifying a real or virtual dictionary
>>      or other logical group of tags to which the tag
>>      belongs
>>
>>  The other tags that John proposes for David's uses
>>  actually fit better in terms of normalization in this sub-
>>  category, than on the top level, but that is a decision
>>  for David to make.  I am happy either way.
>>
>>  The addition to the ddl dictionary would be:
>>
>>  save_ALIAS_XREF
>>
>>    _definition.id      alias_xref
>>    _definition.scope   Category
>>    _definition.class   List
>>    _definition.update  2011-01-21
>>  ;
>>     The attributes used to specify the actual dictionary,
>>     virtual dictionary, or other logical grouping of
>>     tags, indicated by an xref code, to which a given tag belongs.
>>
>>     The default xref code, under which all tags with
>>     no defined xref group are placed, is the one specified by
>>     a null value.
>>
>>  ;
>>     _category.parent_id  alias
>>     _category_key.primitive  ['_alias_xref.definition_id',
>>                               '_alias_xref.xref_code']
>>      save_
>>
>>  save_alias_xref.definition_id
>>      _definition.id   '_alias_xref.definition_id'
>>      _definition.class  Attribute
>>      _definition.update 2011-01-21
>>      _description.text
>>  ;
>>      Identifier tag of a definition associated with
>>      an xref code by which to group this tag with
>>      other tags.  A single tag may be associated
>>      with multiple xref codes.  An xref code does
>>      not have to be associated with a particular
>>      dictionary, nor with a particular DDL format.
>>
>>      Note that the tag does not have to be a valid
>>      tag under DDLm tag construction rules, but
>>      it should be a valid tag under the rules of
>>      some DDL.
>>  ;
>>      _name.category_id alias_xref
>>      _name.object_id   definition_id
>>      _type.purpose     Key
>>      _type.container   Single
>>      _type.contents    Code
>>       save_
>>
>>  save_alias_xref.xref_code
>>      _definition.id   '_alias_xref.xref_code'
>>      _definition.class  Attribute
>>      _definition.update 2011-01-21
>>      _description.text
>>  ;
>>      A code identifying the actual dictionary,
>>      virtual dictionary or other logical grouping
>>      to which the identifier tag belongs.
>>  ;
>>      _name.category_id alias_xref
>>      _name.object_id   xref_code
>>      _type.purpose     Key
>>      _type.container   Single
>>      _type.contents    Code
>>       save_
>>
>>
>>
>>  =====================================================
>>   Herbert J. Bernstein, Professor of Computer Science
>>     Dowling College, Kramer Science Center, KSC 121
>>          Idle Hour Blvd, Oakdale, NY, 11769
>>
>>                   +1-631-244-3035
>>                   yaya@dowling.edu
>>  =====================================================
>>
>>  On Fri, 21 Jan 2011, Bollinger, John C wrote:
>>
>>>
>>>  On Friday, January 21, 2011 8:57 AM, David Brown wrote:
>>>  [...]
>>>>  I would like to know exactly what I am voting on.  There seems to be
>>>>  general agreement on the information that is needed for an alias, the
>>>>  only dispute is the format in which it will appear.  If the various
>>>>  pieces of information I listed each had their own item, this would be
>>>>  agreeable and we could delegate someone to come up with the requisite
>>>>  DDLm save frames, but if this information is to be included, explicitly
>>>>  or implicitly, in a smaller number of items, then I would like to see
>>>>  the definitions and descriptions so that I could understand how each
>>>>  piece of information would be retrieved.  John B, can you supply us
>>>>  with an example of what your normalized item(s) would look like?
>>>
>>>
>>>  Indeed, here is the formal proposal I promised, at the end of which is
>>>  an example:
>>>
>>>
>>>  Proposal: Extended Alias Attributes
>>>  ===================================
>>>
>>>  Introduction / Rationale
>>>  ------------------------
>>>
>>>  This proposal aims primarily to provide all the ALIAS attributes
>>>  that several members of this group have recently agreed are needed
>>>  (at least in principle).  However, attributes that are properties
>>>  of dictionaries rather than of individual data names are
>>>  normalized out of the ALIAS category and into the DICTIONARY_XREF
>>>  category.  The description of the DICTIONARY_XREF category is
>>>  slightly modified to be explicitly consistent with this usage and
>>>  with the concept of referencing logical dictionaries that have no
>>>  independent physical manifestation.
>>>
>>>
>>>  Proposed Actions
>>>  ----------------
>>>
>>>  1) Replace _alias.dictionary_uri with:
>>>
>>>  _alias.xref_code: Specifies a code that identifies the logical or
>>>  physical dictionary in which the alias is defined.  This serves to
>>>  categorize and fully identify the alias.
>>>     _type.purpose     Identify
>>>     _type.container   Single
>>>     _type.contents    Code
>>>
>>>  2) Add these attributes:
>>>
>>>  _alias.dictionary_version: Specifies the first version of the
>>>  dictionary identified by _alias.xref_code that defines the alias.
>>>     _type.purpose     Identify
>>>     _type.container   Single
>>>     _type.contents    Code
>>>
>>>  _alias.deprecated: Specifies whether use of the alias is deprecated.
>>>     _type.purpose     State
>>>     _type.container   Single
>>>     _type.contents    YesorNo
>>>
>>>  3) In the ALIAS category, replace attribute _category_key.generic with:
>>>     _category_key.primitive [ '_alias.xref_code' '_alias.definition_id' ]
>>>
>>>  4) Modify the definition of _dictionary_xref.format by changing its
>>>  _type.contents attribute to "Code".
>>>
>>>  5) Remove _definition.xref_code (its purpose will be served via the
>>>  alias mechanism)
>>>
>>>  6) Modify the description of the DICTIONARY_XREF category to: "The
>>>  DICTIONARY_XREF attributes identify and describe logical or physical
>>>  dictionaries to which items in the current dictionary are
>>>  cross-referenced using the _alias.xref_code attribute."
>>>
>>>
>>>  Comments
>>>  --------
>>>
>>>  Here is the resulting correspondence between DDLm data names and David's
>>>  list of alias attributes:
>>>
>>>  "The tag" -> _alias.definition_id (unchanged by this proposal)
>>>
>>>  "the dictionary in which it appears" -> a row/instance of
>>>  DICTIONARY_XREF, identified by _alias.xref_code (added by this proposal)
>>>
>>>  "the version of this dictionary" -> _alias.dictionary_version (added by
>>>  this proposal)
>>>
>>>  "the DDL in which the dictionary is written" -> _dictionary_xref.format
>>>  (type attributes modified by this proposal)
>>>
>>>  "a flag to indicate whether the dataname is deprecated" ->
>>>  _alias.deprecated (added by this proposal)
>>>
>>>  "a pointer to where the named dictionary can be found" ->
>>>  _dictionary_xref.uri (unchanged by this proposal)
>>>
>>>
>>>  Although this proposal chooses the existing DICTIONARY_XREF category as
>>>  the normalized location for alias attributes that depend only on
>>>  dictionary, it would also be possible to instead introduce a new,
>>>  parallel category for this purpose.  If the _definition.xref_code is
>>>  merged into the alias feature as I propose, however, then
>>>  DICTIONARY_XREF no longer has any other purpose.  On the other hand, it
>>>  is not essential to drop _definition.xref_code.
>>>
>>>  As in my previous proposal concerning _alias.dictionary_uri, the key for
>>>  the ALIAS category is expanded to a compound one containing the
>>>  dictionary identifier and the data name.  This allows one data name's
>>>  appearances in multiple dictionaries all to be aliased to the same
>>>  defined name, without implying that all possible definitions of the name
>>>  are aliased.  Essentially, it scopes the alias to the dictionary in
>>>  which it appears.  DDL2's similar ITEM_ALIASES category is keyed not
>>>  only to name and dictionary identifier, but also to dictionary version;
>>>  the last seems needless, even in DDL2, because we can assume that once
>>>  introduced into a dictionary, data names are not removed or incompatibly
>>>  changed.
>>>
>>>  The type attributes of _dictionary_xref.format are changed so that this
>>>  attribute represents a computer-interpretable code describing at least
>>>  the DDL compliance level of the referenced dictionary.  Allowed values
>>>  could be defined so that they encompass other information as well, very
>>>  much like the proposed tag_style might do.  It might be desirable for
>>>  DDLm to enumerate allowed values for this attribute, but it would be
>>>  more flexible to have an external register, such as Herbert proposed for
>>>  tag_style.  I presently take no position on the best course in that
>>>  regard, but this proposal does not provide enumerated values.
>>>
>>>  This proposal is offered for comment.  Although I would be willing to
>>>  have a vote on it as it stands, it could likely be improved.  I am open
>>>  to changing some of the details if that will contribute to broader
>>>  acceptance.
>>>
>>>
>>>  Example
>>>  -------
>>>
>>>  loop_
>>>     _dictionary_xref.code
>>>     _dictionary_xref.date
>>>     _dictionary_xref.format
>>>     _dictionary_xref.name
>>>     _dictionary_xref.uri
>>>     core  '2010-Jun-29'  DDL1  cif_core.dic   ftp://ftp.iucr.org/pub/cif_core.dic
>>>     mmcif '2005-Jun-27'  DDL2  mmcif_std.dic  ftp://ftp.iucr.org/pub/cif_mm.dic
>>>
>>>  [...]
>>>
>>>  save_diffrn_standards.decay_percent
>>>     _definition.id             '_diffrn_standards.decay_percent'
>>>
>>>  [...]
>>>
>>>     loop_
>>>         _alias.xref_code
>>>         _alias.definition_id
>>>         _alias.dictionary_version
>>>         _alias.deprecated
>>>         core  '_diffrn_standards_decay_%' . no
>>>         mmcif '_diffrn_standards.decay_%' . no
>>>
>>>  save_
>>>
>>>
>>>  Regards,
>>>
>>>  John
>>>
>>>  --
>>>  John C. Bollinger, Ph.D.
>>>  Department of Structural Biology
>>>  St. Jude Children's Research Hospital
>>>
>>>
>>>  Email Disclaimer:  www.stjude.org/emaildisclaimer
>>>
>>>  _______________________________________________
>>>  ddlm-group mailing list
>>>  ddlm-group@iucr.org
>>>  http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>
>
>
>
>--
>T +61 (02) 9717 9907
>F +61 (02) 9717 3145
>M +61 (04) 0249 4148


-- 
=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================
