Re: [ddlm-group] DDLm aliases (subject changed). .. .. .. .. .. .... .
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] DDLm aliases (subject changed). .. .. .. .. .. .... .
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Tue, 1 Feb 2011 17:59:09 -0500
- In-Reply-To: <8F77913624F7524AACD2A92EAF3BFA54166D7D1EF4@SJMEMXMBS11.stjude.sjcrh.local>
- References: <AANLkTi=ATdNovWFiecEwDrbtMdTwZ7guvYuBCGrdnb-i@mail.gmail.com><8F77913624F7524AACD2A92EAF3BFA54166D7D1EDE@SJMEMXMBS11.stjude.sjcrh.local> <4D404DAA.8070804@mcmaster.ca> <a06240802c96600c48956@[192.168.2.102]><8F77913624F7524AACD2A92EAF3BFA54166D7D1EE1@SJMEMXMBS11.stjude.sjcrh.local> <a06240800c9668e1faa7c@[192.168.2.102]><8F77913624F7524AACD2A92EAF3BFA54166D7D1EE8@SJMEMXMBS11.stjude.sjcrh.local> <a06240802c9674292646e@[192.168.2.102]><8F77913624F7524AACD2A92EAF3BFA54166D7D1EEB@SJMEMXMBS11.stjude.sjcrh.local> <4D41C6E7.2040109@rcsb.rutgers.edu><8F77913624F7524AACD2A92EAF3BFA54166D7D1EEF@SJMEMXMBS11.stjude.sjcrh.local> <a06240800c967b204830b@[192.168.2.102]><8F77913624F7524AACD2A92EAF3BFA54166D7D1EF0@SJMEMXMBS11.stjude.sjcrh.local> <alpine.BSF.2.00.1101282147550.61818@epsilon.pair.com><8F77913624F7524AACD2A92EAF3BFA54166D7D1EF1@SJMEMXMBS11.stjude.sjcrh.local> <a06240801c96cc655685e@[149.72.7.214]><8F77913624F7524AACD2A92EAF3BFA54166D7D1EF4@SJMEMXMBS11.stjude.sjcrh.local>
Bottom line (literally)

>I see no reason why DDLm instance documents (i.e. dictionaries)
>should have different presentation rules than the instance documents
>they themselves describe. Given a valid, possibly-denormalized
>instance document and a dictionary with which it complies, it must
>be possible to programmatically normalize the instance to the form
>described by the dictionary (else the document contains
>inconsistencies and therefore is invalid). DDLm dictionaries are
>instance documents of DDLm, so there is no need for different
>behavior with respect to them.
>
>Although I think the same applies to DDLm's own presentation, I am
>concerned about what would happen if DDLm were presented in a
>denormalized form that contained inconsistencies. Rather than
>expend continuing effort to ensure that a denormalized presentation
>of DDLm remains consistent, I would rather expend effort to express
>and maintain DDLm in its self-defined normalized form. In any case,
>I emphasize again that allowing a denormalized presentation is not
>at all the same thing as defining a denormalized model.
>
>None of the foregoing settles just what presentation rules DDLm
>should actually require with respect to joined categories. Should
>denormalizing joins be permitted? There is a cost/benefit analysis
>to be performed here, but I'm not up to attempting it at the moment.

Which seems to leave us entirely with a matter of taste: does anybody
want to have a denormalized version of alias and its subcategories?
I am happy to do it either way. I just need to use the sets. If nobody
else speaks up, I'll just make a guess and start programming on that
basis. We can then look at the result and figure out whether to use it
or redo it in Madrid.

Herbert

At 4:22 PM -0600 2/1/11, Bollinger, John C wrote:
>Dear Herbert,
>
>On Monday, January 31, 2011 3:09 PM, Herbert J. Bernstein wrote:
>
>>At 1:20 PM -0600 1/31/11, Bollinger, John C wrote:
>
>[...]
>
>>This discussion began with adding what we were then calling styles
>>to group related sets of tags. One tag could have multiple styles.
>>In normalized form, that would mean creating a new relation with
>>the tags and the styles as components of a composite key, so the
>>same key could be repeated with multiple styles and the same
>>style could be repeated with multiple keys.
>
>Indeed so. This is what the ALIAS_DEFINITION_SET category provides
>(by whichever name it's now going).
>
>>Placing that directly in the alias category instead of
>>in a separate relation _is_ a denormalization.
>
>In a formal sense, I think you're saying that the result would not
>satisfy second normal form because _alias.dictionary_uri would
>depend on only part of the key (_alias.definition_id). I agree.
>That does rely on _alias.dictionary_uri not being part of a
>candidate key, but the current definition assumes that.
>
>If the only attributes were _alias.definition_id and
>_alias.definition_set_id, however, and both were elements of the
>key, then the category would comply even with domain-key normal
>form. One might in that case complain that the meaning of the ALIAS
>category was changed, and that would be true, but it would be as
>normalized as can be.
>
>> You happen to
>>have preferred to use the xref_code, but adding that to the
>>alias category key is and was a denormalization. In CIF, until
>>now at least, COMCIFS has tried to maintain a global name space,
>>with a given tag having one meaning across multiple dictionaries.
>>That is why there is a prefix registration system, so adding
>>the dictionary to the alias key should not be necessary.
>
>So this is exactly one of the conversations I said we needed to
>have: "What is the entity being modeled, and what assumptions are
>being made about it? [... T]his question could be framed as 'Should
>a dictionary identifier be added to the ALIAS category key?'" Thank
>you for indulging me.
>
>Xref_code, or some other dictionary identifier, is a different case
>than definition_set_id. Whereas there is no viable argument for
>definition_set_id being part of a candidate key for ALIAS as that
>category is currently defined, there *are* arguments for xref_code
>being part of a candidate key. We can choose how we want to model
>things, but the decision is not arbitrary: it has technical,
>semantic, and policy implications.
>
>From a technical perspective, the question can be again reframed as
>"does a definition_id determine the dictionary in which its
>definition appears?" Inasmuch as the definition does not presently
>include dictionary_uri in the category key, DDLm as currently
>constituted appears to say "yes." I think that's erroneous. At
>minimum, COMCIFS' intention seems to be to redefine many mmCIF data
>names in a DDLm dictionary, and Herbert has expressed plans to do
>similarly for imgCIF. Herbert nevertheless offers a contrasting
>view:
>
>>The idea in CIF is that you _don't_ use the same tag name with
>>different meanings in different dictionaries, but with the introduction
>>of DDL2 and mmCIF we ended up with 2 versions of the same core definitions
>>having the same meanings but different tag names. Thus we needed to
>>have aliases to relate the DDL2 dotted notation versions of the
>>tags to the DDL1 undotted notations of the tags.
>
>I understand the original impetus for aliases. Interpreting DDL2,
>however, I conclude that the concept was broadened during
>development, and that the assumption of data names having global
>scope was intentionally avoided. Others here were closer to the
>process than I, but I observe that the description of the DDL2
>ITEM_ALIASES category specifically says "Each alias name is
>*identified by* the name and version of the dictionary to which it
>belongs" (emphasis added). Indeed, the category key is
>(_item_aliases.alias_name, _item_aliases.dictionary,
>_item_aliases.version).
>That's even broader than anything currently
>under discussion for DDLm. ITG remarks that
>"_item_aliases.dictionary [... is] provided to distinguish between
>dictionaries [...]," which would not be necessary if a given data
>name could be assumed to be defined in only one dictionary, or even
>to be defined equivalently in every dictionary where it appears.
>
>As much as the idea may be to globally avoid data name clashes, it
>is not necessary to assume that they are successfully avoided.
>Rejecting that assumption not only protects against failures and
>policy changes in the CIF community, but it also makes DDLm a better
>candidate for adoption in disciplines with less central authority.
>Furthermore, although we do not need to follow DDL2 here, it does
>establish a precedent for scoping aliases to specific dictionaries.
>These are all good reasons to choose that, for DDLm's purposes,
>definition_id PLUS some form of dictionary identifier are required
>to uniquely identify an alias definition. Are there good reasons to
>choose otherwise?
>
>Supposing that we do adopt the view that unique identification of
>definitions requires at least definition_id and a dictionary
>identifier, ALIAS is not even a proper relation unless a dictionary
>identifier (such as xref_code) is added to the category key.
>
>[...]
>
>>I would be very happy having fully normalized DDLm dictionaries, but
>>I can cope with denormalized dictionaries, just as I have to cope
>>with denormalized datafiles -- indeed, for some search procedures,
>>I deliberately denormalize dictionaries internally. It
>>sounds like John B. wants to stick to fully normalized DDLm dictionaries.
>
>Hmm. I would be happy to see dictionaries define data models that
>comply with higher normalization forms, but that is a design
>decision that should rest with their authors and maintainers.
>I would in particular like DDLm itself to describe a highly normalized
>model for its own domain (dictionaries), though exactly which form
>would be most appropriate is an open question. Ensuring that DDLm
>describes a well-normalized data model does not force other DDLm
>dictionaries to describe equally normalized models. *Presentation*
>of these models, on the other hand, remains a separate issue,
>discussed next.
>
>>While this has some impact on software developers, it has very little
>>direct impact on users -- so what do people think:
>>
>> Should all DDLm dictionaries be fully normalized (if so, to which level
>>of normalization) or
>>
>> Should DDLm dictionaries be allowed the same flexibility as
>>data files in being denormalized?
>
>I see no reason why DDLm instance documents (i.e. dictionaries)
>should have different presentation rules than the instance documents
>they themselves describe. Given a valid, possibly-denormalized
>instance document and a dictionary with which it complies, it must
>be possible to programmatically normalize the instance to the form
>described by the dictionary (else the document contains
>inconsistencies and therefore is invalid). DDLm dictionaries are
>instance documents of DDLm, so there is no need for different
>behavior with respect to them.
>
>Although I think the same applies to DDLm's own presentation, I am
>concerned about what would happen if DDLm were presented in a
>denormalized form that contained inconsistencies. Rather than
>expend continuing effort to ensure that a denormalized presentation
>of DDLm remains consistent, I would rather expend effort to express
>and maintain DDLm in its self-defined normalized form. In any case,
>I emphasize again that allowing a denormalized presentation is not
>at all the same thing as defining a denormalized model.
>
>None of the foregoing settles just what presentation rules DDLm
>should actually require with respect to joined categories.
>Should denormalizing joins be permitted? There is a cost/benefit analysis
>to be performed here, but I'm not up to attempting it at the moment.
>
>
>John
>
>--
>John C. Bollinger, Ph.D.
>Department of Structural Biology
>St. Jude Children's Research Hospital
>
>
>Email Disclaimer: www.stjude.org/emaildisclaimer
>
>_______________________________________________
>ddlm-group mailing list
>ddlm-group@iucr.org
>http://scripts.iucr.org/mailman/listinfo/ddlm-group

--
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
yaya@dowling.edu
=====================================================
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
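
The normalization step described in the quoted message (splitting a denormalized alias presentation into an alias relation plus a separate relation keyed on the tag and its set, and rejecting the document if that cannot be done consistently) can be sketched roughly as follows. This is only an illustration, not anything taken from DDLm or from software mentioned in the thread: the function name, the row layout, the data names and the URIs are all hypothetical, and the sketch assumes the ALIAS definition under discussion, in which definition_id alone determines dictionary_uri. That is exactly the assumption the message questions; adding a dictionary identifier to the key would change the check.

```python
from typing import Dict, Iterable, Set, Tuple

# Hypothetical denormalized row: (definition_id, definition_set_id, dictionary_uri)
Row = Tuple[str, str, str]


def normalize_aliases(rows: Iterable[Row]) -> Tuple[Dict[str, str], Set[Tuple[str, str]]]:
    """Split denormalized alias rows into two relations.

    Returns (alias, membership), where alias maps definition_id to its
    dictionary_uri and membership holds (definition_id, definition_set_id)
    pairs, i.e. the composite key of the separate set relation.
    Raises ValueError if one definition_id appears with two different
    dictionary_uri values, since such rows cannot be normalized.
    """
    alias: Dict[str, str] = {}
    membership: Set[Tuple[str, str]] = set()
    for definition_id, set_id, uri in rows:
        # setdefault records the first URI seen and returns the stored one
        known_uri = alias.setdefault(definition_id, uri)
        if known_uri != uri:
            raise ValueError(
                f"{definition_id} maps to both {known_uri} and {uri}"
            )
        membership.add((definition_id, set_id))
    return alias, membership


# Made-up example data: one tag belongs to two sets, repeating its URI.
denormalized = [
    ("_example.tag_a", "set_1", "http://example.org/dict_x"),
    ("_example.tag_a", "set_2", "http://example.org/dict_x"),
    ("_example.tag_b", "set_1", "http://example.org/dict_y"),
]
alias, membership = normalize_aliases(denormalized)
```

In this sketch the repeated URI on the first two rows is the redundancy that the denormalized form carries, and the ValueError branch corresponds to the case where a denormalized presentation contains inconsistencies.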