Re: [ddlm-group] DDLm aliases (subject changed). .. .. .
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] DDLm aliases (subject changed). .. .. .
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Thu, 27 Jan 2011 11:16:15 -0500
- In-Reply-To: <8F77913624F7524AACD2A92EAF3BFA54166D7D1EE8@SJMEMXMBS11.stjude.sjcrh.local>
- References: <AANLkTi=ATdNovWFiecEwDrbtMdTwZ7guvYuBCGrdnb-i@mail.gmail.com><8F77913624F7524AACD2A92EAF3BFA54166D7D1EDE@SJMEMXMBS11.stjude.sjcrh.local> <4D404DAA.8070804@mcmaster.ca> <a06240802c96600c48956@[192.168.2.102]><8F77913624F7524AACD2A92EAF3BFA54166D7D1EE1@SJMEMXMBS11.stjude.sjcrh.local> <a06240800c9668e1faa7c@[192.168.2.102]><8F77913624F7524AACD2A92EAF3BFA54166D7D1EE8@SJMEMXMBS11.stjude.sjcrh.local>
Let me just talk about the category join issue. The current documentation is vague about how one should match up the keys, and John B.'s interpretation may well be what was intended, but I think for the join to actually be useful, it has to be extended to cover the normalization and denormalization cases, in which the choice of keys depends on the degree of normalization. This actually gets back to an old disagreement between CCP4 and the PDB, which could finally be resolved with a liberal (i.e. denormalization-friendly) interpretation of category join.

When you normalize a category, you often strip out several columns that were originally key components in the larger category, and put them entirely in the child category, so there is less repetition in the parent category. If we are to allow data described by a dictionary with normalized categories (which have fewer key components) to be presented as the original wider, flatter denormalized categories, then we need to interpret _category.parent_join in a way that permits more key components in the denormalized presentation, e.g.:

If a tag in the child category has a linked tag in the parent category (either directly or by having a common directly or indirectly linked tag), then the tag from the parent category must be used in the joined presentation.

If a tag in the child category has no linked tag in the parent category, then an attempt will be made to construct a tag using dotted notation combining the parent category name with the child category object name. If that composite name does not conflict with an existing tag, then that composite name will be used in the joined presentation. If there is a conflict, the child category tag name will be used in the joined presentation.

If a tag from the child category is a member of the key of the child category and a joined presentation includes that item, then it will automatically be added to the key of the joined presentation.
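The naming and key rules above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not part of any DDLm software: the function name `join_presentation`, its parameters, and the `linked` mapping (which is assumed to have already resolved direct and indirect links to a parent tag) are all hypothetical. The example data reuses the alias / alias_ensemble categories from this thread; `_alias.xref_code` and its link are assumed for the sake of the example, and every child item is assumed to appear in the joined presentation.

```python
def join_presentation(parent_category, parent_key, existing_tags,
                      child_tags, child_key, linked):
    """Return (names, joined_key) for a denormalizing join.

    parent_category: parent category name, e.g. 'alias'
    parent_key:      list of tags forming the parent category key
    existing_tags:   tags already defined (used for conflict checks)
    child_tags:      dict mapping each child tag to its object id
    child_key:       list of tags forming the child category key
    linked:          dict mapping a child tag to its linked parent tag
                     (hypothetical: links assumed already resolved)
    """
    names = {}
    taken = set(existing_tags)
    for child_tag, object_id in child_tags.items():
        if child_tag in linked:
            # Rule 1: a linked parent tag must be used.
            joined = linked[child_tag]
        else:
            # Rule 2: try <parent category>.<child object id>.
            composite = f"_{parent_category}.{object_id}"
            # Rule 3: on conflict, fall back to the child tag name.
            joined = composite if composite not in taken else child_tag
        names[child_tag] = joined
        # Assumption: names assigned so far also count as "existing".
        taken.add(joined)

    # Rule 4: child key items are added to the key of the joined presentation.
    joined_key = list(parent_key)
    for child_tag in child_key:
        if names[child_tag] not in joined_key:
            joined_key.append(names[child_tag])
    return names, joined_key


# Example data: alias_ensemble joined into alias.
child_tags = {
    "_alias_ensemble.ensemble_id": "ensemble_id",
    "_alias_ensemble.definition_id": "definition_id",
    "_alias_ensemble.xref_code": "xref_code",
}
linked = {
    "_alias_ensemble.definition_id": "_alias.definition_id",
    "_alias_ensemble.xref_code": "_alias.xref_code",  # assumed link
}
parent_key = ["_alias.definition_id", "_alias.xref_code"]  # assumed key

names, joined_key = join_presentation(
    "alias", parent_key, set(parent_key),
    child_tags, list(child_tags), linked)
```

Under these assumptions the unlinked `_alias_ensemble.ensemble_id` is presented as `_alias.ensemble_id` and is appended to the joined key, which is the "more key components in the denormalized presentation" case described above.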
This interpretation of the join semantics would allow flexible use of normalized and denormalized presentations of data without having to clutter parent categories with definitions from child categories that are not needed, and indeed cannot be used, in the normalized presentation, allowing greatly simplified flat tables for such things as data harvest, but clean, normalized tables for database loads.

I'll send an updated draft reflecting John's other comments in the next message, but the issue of allowing denormalizing joins is a separable discussion.

Regards,
  Herbert

At 9:34 AM -0600 1/27/11, Bollinger, John C wrote:
>On Wednesday, January 26, 2011 9:50 PM, Herbert J. Bernstein wrote:
>
>>So, to pull it all together, see below.  Please review and see
>>what I have missed, mistyped or failed to convert from some
>>earlier incarnation.  Comments, corrections and suggestions
>>greatly appreciated.
>>
>>I have not yet included the type change for _dictionary_xref.format
>>because I am not sure a single word code would be sufficient
>>to describe the format of any given dictionary, so for the moment
>>it is still Text.
>
>Herbert's latest version looks good to me.  See below for comments
>and tentative corrections:
>
>[...]
>
>>save_ALIAS_ENSEMBLE
>>
>>     _definition.id             alias_ensemble
>>     _definition.scope          Category
>[...]
>>     _category.parent_id        alias
>>     _category.parent_join      Yes
>>     _category_key.primitive    ['_alias_ensemble.ensemble_id',
>>                                 '_alias_ensemble.definition_id',
>>                                 '_alias_ensemble.xref_code']
>>     save_
>
>
>As I understand the use of _category.parent_join, I think its value
>for this category needs to be 'No', because the parent category has
>a different (narrower) key structure.
>
>
>>save_alias_ensemble.definition_id
>>     _definition.id             '_alias_ensemble.definition_id'
>>     _definition.class          Attribute
>>     _definition.update         2011-01-21
>>     _description.text
>>;
>>     Identifier tag of a definition associated with
>>     an xref code by which to group this tag with
>>     other tags.
>>
>>     A given tag may belong to multiple ensembles
>>     and may be cited against multiple dictionaries.
>>
>>     Note that the tag does not have to be a valid
>>     tag under DDLm tag construction rules, but
>>     it should be a valid tag under the rules of
>>     some DDL.
>>;
>
>I would prefer to describe this a bit differently:
>;
>     Together with _alias_ensemble.xref_code, identifies
>     an alias belonging to an ensemble.  An alias may
>     belong to any number of ensembles, including zero.
>;
>
>I omit the bit about tag construction rules, as no DDL yet proposed
>defines any such rules; allowable tags are defined by CIF.  As James
>earlier observed, DDLm can define any tag allowed by CIF, even if
>that name is not in the subset addressable by dREL.  Similarly, DDL1
>and DDL2 can both define any data name allowed by CIF1, which
>collectively are a subset of those allowed by CIF2.  See also below.
>
>>     _name.category_id          alias_ensemble
>>     _name.object_id            definition_id
>>     _name.linked_item_id       '_alias.definition_id'
>>     _type.purpose              Key
>>     _type.container            Single
>>     _type.contents             Code
>>     save_
>
>Shouldn't this item's _type.contents be 'Tag' to agree with the
>linked item's?  Alternatively, if 'Tag' signifies something more
>specific than "data name allowed by CIF2" then perhaps
>_alias.definition_id needs to be changed instead.  I presume that
>these questions are related to the comments about DDL tag
>construction rules in this item's proposed description.
>
>
>>save_alias_ensemble.ensemble_id
>>     _definition.id             '_alias_ensemble.ensemble_id'
>>     _definition.class          Attribute
>>     _definition.update         2011-01-26
>>     _description.text
>>;
>>     A code identifying an ensemble of related tags.
>>     To help ensure that dictionaries can be merged,
>>     each code should either begin with an IUCr-registered
>>     prefix or if not prefixed, have been approved
>>     by COMCIFS.  The special prefix 'local_' may be
>>     use for purely internal purposes of an organization.
>>;
>[...]
>
>Is it needful or appropriate to repeat the definition text of the
>linked item here?  As long as we do adopt the ENSEMBLE category, the
>importance of _alias_ensemble.ensemble_id is primarily that it
>associates an alias with one of the ensembles defined elsewhere in
>the dictionary.  I suggest this alternative description text:
>
>;
>     Identifies an ensemble to which the alias identified by
>     ( _alias_ensemble.definition_id, _alias_ensemble.xref_code ) belongs.
>;
>
>
>Regards,
>
>John
>
>--
>John C. Bollinger, Ph.D.
>Department of Structural Biology
>St. Jude Children's Research Hospital
>
>Email Disclaimer:  www.stjude.org/emaildisclaimer
>
>_______________________________________________
>ddlm-group mailing list
>ddlm-group@iucr.org
>http://scripts.iucr.org/mailman/listinfo/ddlm-group

--
=====================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                 +1-631-244-3035
                 yaya@dowling.edu
=====================================================