Re: [ddlm-group] DDLm aliases (subject changed)

Here is the next pass with John W.'s name change and the category
keys aligned.  I would rather have followed the DDLm design philosophy
of not requiring a parent category to know the innards of its children,
but I would not want to hold up agreement on the basic idea over
the detailed technical resolution of denormalization in DDLm.

That being said, we will have to address the denormalization issue
to fully support the realities of macromolecular data processing.

I shudder to point this out, but DDLm also seems to be missing
the concept of implicit tags that is in DDL2 -- not a critical
issue, but it probably should be addressed.   Note that DDLm
only addresses identifying tags that must be present in
a given category so a strict reading would be that, as long
as the value of a key is somehow known (e.g. by having
an enumeration default), it does not have to be physically present
in the data.  I have stretched that point to deal with the
denormalization issue in this case.
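
To make that strict reading concrete, here is a minimal Python sketch of
how software might complete a category key when a key item is not
physically present but has an enumeration default.  It is only an
illustration under stated assumptions: the function name resolve_key,
the DEFAULTS table and the example tag names are hypothetical and are
not part of DDLm or of any existing CIF library.  The only default taken
from the definitions below is the null default of
_alias.identifier_set_id.

```python
# Hypothetical sketch: complete a category key from enumeration defaults
# when a key item is not physically present in the data.

NULL = "."  # CIF null value

# Key of the ALIAS category as proposed in this message.
ALIAS_KEY = (
    "_alias.definition_id",
    "_alias.xref_code",
    "_alias.identifier_set_id",
)

# Assumption: only _alias.identifier_set_id carries an enumeration default.
DEFAULTS = {"_alias.identifier_set_id": NULL}


def resolve_key(row, key_tags=ALIAS_KEY, defaults=DEFAULTS):
    """Return the full key tuple for a row, using defaults for absent tags."""
    key = []
    for tag in key_tags:
        if tag in row:
            key.append(row[tag])
        elif tag in defaults:
            key.append(defaults[tag])
        else:
            # No value and no default: the key cannot be known.
            raise KeyError(f"key item {tag} is absent and has no default")
    return tuple(key)
```

With this reading, a row that omits _alias.identifier_set_id still has a
complete key, while a row that omits _alias.xref_code does not.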

  -- Herbert


save_definition.xref_code
       _definition.id             '_definition.xref_code'
       _definition.update           2011-01-26
       _definition.class            Attribute
       _description.text
;
        Code identifying the equivalent definition in the dictionary
        referenced by the DICTIONARY_XREF attributes.

        Use of _definition.xref_code is deprecated in favor of
        use of _alias.xref_code
;
       _name.category_id            definition
       _name.object_id              xref_code
       _type.purpose                Identify  
       _type.container              Single
       _type.contents               Code
        save_

save_ALIAS

       _definition.id               alias
       _definition.scope            Category
       _definition.class            List
       _definition.update           2011-01-26
       _description.text
;
        The attributes used to specify the aliased names of definitions.
        Every tag has an implicit alias to itself with a null
        _alias.xref_code to allow use of the primary tag in
        the ALIAS_IDENTIFIER_SET category.
       
        The use of _alias.identifier_set_id in the key of
        this category is to provide a placeholder that
        conforms the key of the parent ALIAS category
        to the key of the child ALIAS_IDENTIFIER_SET
        for automatic joins.  It is not intended that
        _alias.identifier_set_id should be used in the
        ALIAS category when no join is being done.
       
;
       _category.parent_id          ddl_attr
       _category_key.primitive      ['_alias.definition_id',
                                     '_alias.xref_code',
                                     '_alias.identifier_set_id']
        save_


save_alias.definition_id
       _definition.id             '_alias.definition_id'
       _definition.class            Attribute
       _definition.update           2006-11-16
       _description.text
;
        Identifier tag of an aliased definition.
;
       _name.category_id            alias
       _name.object_id              definition_id
       _type.purpose                Key  
       _type.container              Single
       _type.contents               Tag
        save_

save_alias.deprecated
       _definition.id             '_alias.deprecated'
       _definition.class            Attribute
       _definition.update           2006-11-16
       _description.text
;
        Specifies whether use of the alias is deprecated.
;
       _name.category_id            alias
       _name.object_id              deprecated
       _type.purpose                State
       _type.container              Single
       _type.contents               YesorNo
       _enumeration.default         No
        save_


save_alias.dictionary_uri
       _definition.id             '_alias.dictionary_uri'
       _definition.update           2011-01-26
       _definition.class            Attribute
       _description.text
;
        Dictionary URI in which the aliased definition belongs.
        _alias.dictionary_uri is deprecated in favor of
        _alias.xref_code.
;
       _name.category_id            alias
       _name.object_id              dictionary_uri
       _type.purpose                Identify  
       _type.container              Single
       _type.contents               Uri
        save_
       
save_alias.identifier_set_id
       _definition.id   '_alias.identifier_set_id'
       _definition.class  Attribute
       _definition.update 2011-01-26
       _description.text
;
       A code identifying an identifier_set of related tags.
       This linked item is provided in the ALIAS category to
       ensure that the key of the ALIAS category is
       conformed to the key of the ALIAS_IDENTIFIER_SET
       category.  If the alias has not been joined with
       ALIAS_IDENTIFIER_SET, it is not intended that
       _alias.identifier_set_id be used in the ALIAS
       category.
      
       This is a pointer to _identifier_set.id
;
       _name.category_id alias
       _name.object_id   identifier_set_id
       _name.linked_item_id         '_identifier_set.id'
       _type.purpose     Key
       _type.container   Single
       _type.contents    Code
       _enumeration.default  . 
      save_


save_alias.xref_code
       _definition.id             '_alias.xref_code'
       _definition.update           2011-01-26
       _definition.class            Attribute
       _description.text
;
        Code identifying the dictionary containing the primary
        definition of the aliased tag, as given in the
        DICTIONARY_XREF category.

;
       _name.category_id            alias
       _name.object_id              xref_code
       _name.linked_item_id         '_dictionary_xref.code'
       _type.purpose                Key
       _type.container              Single
       _type.contents               Code
        save_


save_IDENTIFIER_SET

     _definition.id      identifier_set
     _definition.scope   Category
     _definition.class   List
     _definition.update  2011-01-27
     _description.text
;
      Data items used to describe the identifier_set identifiers
      used in this dictionary.  Data items in this category
      are NOT used directly as attributes of individual data items.
      See linked item _alias_identifier_set.identifier_set_id
      for such uses.

     
;
      _category.parent_id ddl_attr
      _category_key.generic  '_identifier_set.id'

      save_

save_identifier_set.id
       _definition.id   '_identifier_set.id'
       _definition.class  Attribute
       _definition.update 2011-01-27
       _description.text
;
       A code identifying an identifier_set of related tags.
       The coverage of an identifier_set may conform precisely
       to the set of tags in a particular dictionary,
       or to tags drawn from multiple dictionaries or
       to a subset of tags from a single dictionary.

       The same tag may belong to multiple identifier
       sets, and a given tag may not belong to any
       identifier set, in which case the only associated
       identifier set is a null value.

       To help ensure that dictionaries can be merged,
       each code should either begin with an IUCr-registered
       prefix or, if not prefixed, have been approved
       by COMCIFS.  The special prefix 'local_' may be
       used for purely internal purposes of an organization.
;
       _name.category_id identifier_set
       _name.object_id   id
       _type.purpose     Key
       _type.container   Single
       _type.contents    Code

      save_

save_identifier_set.description
       _definition.id   '_identifier_set.description'
       _definition.class  Attribute
       _definition.update 2011-01-27
       _description.text
;
       A description of the identifier_set.
;
       _name.category_id identifier_set
       _name.object_id   description
       _type.purpose     Describe
       _type.container   Single
       _type.contents    Text


      save_

save_ALIAS_IDENTIFIER_SET

      _definition.id      alias_identifier_set
      _definition.scope   Category
      _definition.class   List
      _definition.update  2011-01-27
      _description.text
;
       The attributes used to specify the identifier_set of
       tags to which a given tag belongs.

       A given tag may belong to multiple identifier_sets
       and may be cited against multiple dictionaries.

       Note that _alias_identifier_set.identifier_set_id is a
       component of the key of ALIAS_IDENTIFIER_SET.  If the
       denormalized join presentation is used to bring the object
       ids of this child category up into the parent
       ALIAS category, then _alias.identifier_set_id will
       be used as an implicit addition to the key of
       the denormalized ALIAS category.
      
       Until DDLm can be formally revised to automatically
       handle the necessary promotion of child category keys
       in denormalized joins, a place-holder
       _alias.identifier_set_id has been defined in the
       ALIAS category.

;
      _category.parent_id  alias
      _category.parent_join  Yes
      _category_key.primitive  ['_alias_identifier_set.identifier_set_id',
                                '_alias_identifier_set.definition_id',
                                '_alias_identifier_set.xref_code']
       save_

save_alias_identifier_set.definition_id
       _definition.id   '_alias_identifier_set.definition_id'
       _definition.class  Attribute
       _definition.update 2011-01-27
       _description.text
;
       Together with _alias_identifier_set.xref_code, identifies
       an alias belonging to an identifier_set.  An alias may
       belong to any number of identifier_sets, including zero.

;
       _name.category_id alias_identifier_set
       _name.object_id   definition_id
       _name.linked_item_id  '_alias.definition_id'
       _type.purpose     Key
       _type.container   Single
       _type.contents    Tag
        save_

save_alias_identifier_set.identifier_set_id
       _definition.id   '_alias_identifier_set.identifier_set_id'
       _definition.class  Attribute
       _definition.update 2011-01-27
       _description.text
;
       Identifies an identifier_set to which the alias
       (identified by _alias_identifier_set.definition_id
       and _alias_identifier_set.xref_code) belongs.

       A pointer to _identifier_set.id
;
       _name.category_id alias_identifier_set
       _name.object_id   identifier_set_id
       _name.linked_item_id  '_identifier_set.id'
       _type.purpose     Key
       _type.container   Single
       _type.contents    Code

      save_


save_alias_identifier_set.xref_code
       _definition.id   '_alias_identifier_set.xref_code'
       _definition.class  Attribute
       _definition.update 2011-01-21
       _description.text
;
       A code identifying the actual dictionary,
       virtual dictionary or other logical grouping
       to which the identifier tag belongs.
;
       _name.category_id alias_identifier_set
       _name.object_id   xref_code
       _name.linked_item_id  '_dictionary_xref.code'
       _type.purpose     Key
       _type.container   Single
       _type.contents    Code
        save_
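
As a rough illustration of the denormalized join presentation described
in the ALIAS_IDENTIFIER_SET frame above, the following Python sketch
reduces joined rows back to normalized ALIAS and ALIAS_IDENTIFIER_SET
rows.  It is a sketch under stated assumptions, not a definitive
implementation: the function name normalize, the plain-dict row layout
and the example values are hypothetical, a null identifier_set_id is
taken to mean that no child row exists, and the only consistency check
shown is that repeated parent rows agree on _alias.deprecated.

```python
# Hypothetical sketch: reduce a denormalized ALIAS presentation to
# normalized ALIAS and ALIAS_IDENTIFIER_SET rows.

NULL = "."  # CIF null value


def normalize(denormalized_rows):
    """Split joined rows into (alias, alias_identifier_set) tables.

    Each input row is a dict with 'definition_id', 'xref_code' and,
    optionally, 'identifier_set_id' and 'deprecated'.
    Raises ValueError if the rows cannot be reduced consistently.
    """
    alias = {}            # (definition_id, xref_code) -> deprecated flag
    alias_id_set = set()  # (identifier_set_id, definition_id, xref_code)
    for row in denormalized_rows:
        parent_key = (row["definition_id"], row["xref_code"])
        # 'No' is the enumeration default given for _alias.deprecated.
        deprecated = row.get("deprecated", "No")
        # The first occurrence records the parent attributes; later
        # occurrences of the same parent key must agree with them.
        if alias.setdefault(parent_key, deprecated) != deprecated:
            raise ValueError(f"conflicting parent attributes for {parent_key}")
        set_id = row.get("identifier_set_id", NULL)
        if set_id != NULL:  # assumption: null means no child row
            alias_id_set.add((set_id,) + parent_key)
    return alias, alias_id_set
```

Two joined rows for the same alias with different identifier_set_id
values reduce to one ALIAS row and two ALIAS_IDENTIFIER_SET rows; rows
that disagree on a parent attribute are rejected as not reducible.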



At 5:21 PM -0600 1/27/11, Bollinger, John C wrote:
>Hi John,
>
>On Thursday, January 27, 2011 1:27 PM, John Westbrook wrote:
>
>>I do not wish to complicate the discussion but I have a somewhat 
>>different perspective on
>>the issue of normalization.
>
>By all means DO complicate the discussion if DDLm's relational model 
>and normalization options are inadequate for your needs.
>
>[...]
>
>>To better address this in software we have added DDL2 extensions to 
>>define parent/child linking groups -
>>
>>See -
>>
>>http://mmcif.pdb.org/dictionaries/mmcif_ddl.dic/Data/history.html
>>
>>categories - pdbx_item_link_group and pdbx_item_link_group_list
>>
>>The groups defined in these categories allow validation of common 
>>items between categories
>>with multiple connecting relationships.   For instance, tables of 
>>bonds, angles and torsions
>>have multiple independent collections of natural keys times the 
>>number of nomenclatures.
>>In some cases the validation must make independent comparisons of 
>>each group against
>>the same group of parent items.
>
>To be sure I understand: those categories provide a means for 
>defining relationships involving candidate keys that are not 
>category keys?  That's an eminently reasonable thing to do.  Do you 
>use them in any broader sense?  For example, where neither end of 
>the link is a candidate key for its category?  Are you looking to 
>get something like this into DDLm, or are you satisfied to define it 
>at dictionary level?
>
>>I raise this issue because it is an unavoidable consequence of 
>>denormalization.  And,
>>as Herbert points out the denormalized organization is important in 
>>data harvesting
>>and generally maintaining a connection to laboratory practice.
>
>I think we're talking about two different levels of (de)normalization.
>
>I can see connections between *dictionary* denormalization and a 
>need or want to define relationships where neither end is a category 
>key.  I don't see the same connection with a denormalized 
>presentation of the *data*.  For a denormalized presentation to be 
>valid, it must be reducible to a normalized representation (to the 
>extent that the dictionary is normalized), so any validations you 
>can perform on a normalized presentation, you can also perform on a 
>valid denormalized presentation.  The possible failure to reduce to 
>a normalized presentation is precisely the additional validation 
>that I already pointed out was required of denormalized 
>presentations.
>
>Am I missing something here?
>
>>In the original design of DDLm there was an emphasis on adopting 
>>simple rather than
>>complex category keys.  This has been an issue of some concern for 
>>me as this does
>>not map well to our data which is rich in complex natural keys.
>
>Even for on-line transactional databases I have never been among 
>those who categorically shun natural keys and shudder at compound 
>ones.  (I have seen production schema where even join tables have 
>their own surrogate keys.  Ridiculous!)  For a human-readable and 
>essentially static medium such as CIF, natural keys are highly 
>appropriate, and there is no compelling reason to avoid compound 
>keys where there is no simple candidate key.  None of the arguments 
>for doing so in an OLTP schema apply.
>
>With that said, the current DDLm draft does provide for compound 
>category keys.  I would be open to extending it to provide for 
>explicit definition of candidate keys as well.  If we do so, then I 
>see no particular reason to not have a way to define relationships 
>involving a candidate key on at least one side.  Before I'll buy 
>into relationships any more generalized than that, however, I'd like 
>to understand the potential uses in more detail.
>
>Of course, no matter what facilities DDLm may provide, it is another 
>question entirely whether dictionary authors use them.
>
>
>Regards,
>
>John
>
>--
>John C. Bollinger, Ph.D.
>Department of Structural Biology
>St. Jude Children's Research Hospital
>
>
>
>Email Disclaimer:  www.stjude.org/emaildisclaimer
>
>_______________________________________________
>ddlm-group mailing list
>ddlm-group@iucr.org
>http://scripts.iucr.org/mailman/listinfo/ddlm-group


-- 
=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================