Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. .. .
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. .. .
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Thu, 20 Jan 2011 06:10:28 -0500 (EST)
- In-Reply-To: <alpine.BSF.2.00.1101200440460.66943@epsilon.pair.com>
- References: <AANLkTikZoEF_D+5-3+Eg4pbCx0cAu+SJvR-a_XkC3zK2@mail.gmail.com><alpine.BSF.2.00.1101190833560.91751@epsilon.pair.com><8F77913624F7524AACD2A92EAF3BFA54166D7D1ECE@SJMEMXMBS11.stjude.sjcrh.local><alpine.BSF.2.00.1101191042290.42382@epsilon.pair.com><4D371BE7.3050501@mcmaster.ca><alpine.BSF.2.00.1101191234221.42382@epsilon.pair.com><8F77913624F7524AACD2A92EAF3BFA54166D7D1ED0@SJMEMXMBS11.stjude.sjcrh.local><alpine.BSF.2.00.1101191632410.65107@epsilon.pair.com><8F77913624F7524AACD2A92EAF3BFA54166D7D1ED1@SJMEMXMBS11.stjude.sjcrh.local><alpine.BSF.2.00.1101191855500.30768@epsilon.pair.com><AANLkTi=xn2ntdNTvdTBKQQTsJhCQFbKcxceJ1C_u1oOf@mail.gmail.com><alpine.BSF.2.00.1101200440460.66943@epsilon.pair.com>
Dear Colleagues,

There is an important part of James' suggestions that, if Brian is
willing, I think it would be a good idea to add to the
_alias.tag_style proposal: a central registry of styles to
facilitate dictionary merging. The ground rules would be: COMCIFS
approval for any style, such as DDL1, DDL2, DDLm, etc., unless
prefixed by a prefix from Brian's prefix registry, e.g. pdbx_. The
special prefix local_ could be used for styles for use purely
locally, i.e. for private dictionaries for which collisions on
merging are not a concern.

Regards,
Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
yaya@dowling.edu
=====================================================

On Thu, 20 Jan 2011, Herbert J. Bernstein wrote:

> Dear Colleagues,
>
> If a DDLm dictionary is to be a fully functional replacement
> for, say, a DDL1 dictionary, a dictionary against which one
> can validate the use of purely DDL1 tags, we need a way to
> not only specify the desired DDL1 tag as an alias to the
> DDLm tag used in the dictionary, but also to specify that
> we do _not_ want to accept the DDLm tag used as the save
> frame name as the valid name. As David has noted, in
> order not to still be maintaining both a DDL1 and a
> DDLm dictionary, we want this information _in_ the DDLm
> dictionary, so simply aliasing back to some other
> DDL1 dictionary to use it as a way to say -- "use that dictionary
> URI as the style indicator" is suboptimal. Worse, it is a
> source of future errors and confusion in that it is
> defining properties of the tag that may end up disagreeing
> with the properties we wish to actually have that we
> defined in the DDLm dictionary.
>
> OK, so far, so good -- all we need then is John B.'s tag-by-tag
> style preference flag to say, for this dictionary we want to
> be DDL1'ish.
>
> Ah, but now we say, we are in the situation of maintaining
> the core (David's problem) in which we have to maintain
> a dictionary for validation against both DDL1 and DDL2
> tag names. Now there are times when we wish the DDL1
> alias to be the preferred alias and for both the DDL2
> and DDLm tags to fail a validation check and other times when
> we wish the DDL2 alias to be the preferred alias and
> for both the DDL1 and DDLm tags to fail a validation check.
> Now it becomes simpler to just have a common style key,
> such as "DDL1" or "DDL2" and to select just the way we
> do for alternate conformers on that key.
>
> OK, that was not so bad, but now we are at, say, the PDB
> and in addition to having DDL1 and DDL2 style tags from
> the core, we also have prefixed tags (pdbx) that should
> eventually get promoted to be prefix-free. Now we can
> use the styles to validate for strict use of
> the prefixes when we are producing output that we want
> to be certain actually does use the prefixes, or
> relax the validation to allow both the prefixed and promoted
> tags, or go strict again on the far side to be sure we
> are only producing promoted tags.
>
> Note that none of these style based input validation choices
> are based on the choice of dictionary -- it is one dictionary,
> so it does not really help to be maintaining the styles
> dictionary by dictionary. The grain of identification is
> too coarse, and involves multiple maintenance issues when
> in reality only one, nice new, DDLm dictionary needs to be
> maintained.
>
> On the output side, essentially the same issues arise, but
> there are fewer users, but as I said, it is a harmless
> addition to the DDLm spec for those who do not wish to
> be aware of it, and for those of us for whom it is
> useful, it really is useful.
>
> The fundamental disagreement is on whether we will have
> to have a DDL1 dictionary, a DDL2 dictionary, a DDLm
> dictionary, a prefix dictionary, etc., and plant them
> on assorted web sites, or just one DDLm dictionary that
> handles everything and can be local or remote or in
> local and remote pieces without changing the behavior
> of the validation or of the output.
>
> I hope that those who are uncomfortable with this change
> will reconsider and support it. Thanks to David's clear
> thinking it is a clean, simple and useful idea, much
> better than my original import suggestion.
>
> Please support it.
>
> Regards,
> Herbert
>
> =====================================================
> Herbert J. Bernstein, Professor of Computer Science
> Dowling College, Kramer Science Center, KSC 121
> Idle Hour Blvd, Oakdale, NY, 11769
>
> +1-631-244-3035
> yaya@dowling.edu
> =====================================================
>
> On Thu, 20 Jan 2011, James Hester wrote:
>
>> I'm trying to get a grip on what problem the tag_style proposal
>> solves. I'll just emphasise at the outset in case there are any
>> misconceptions that it is incorrect to suppose that the dREL method
>> knows or needs to know anything about the particular syntax in which
>> an input or output value is expressed; dREL is concerned purely with
>> describing relationships.
>>
>> Here are the two scenarios that I think are being discussed under the
>> rubric of DDLm compatibility with CIF1:
>>
>> Scenario 1: given a DDLm dictionary, a program wishes to generate and
>> (validate/insert) the value for some given CIF1 dataname in a CIF1
>> datafile, using other CIF1 tags found in that datafile. We are all
>> agreed (I think) that locating the relevant DDLm dictionary entries
>> for a CIF1 dataname is a simple and well-defined task.
>> The formatting of the eventual output value of the DDLm method is
>> also not in the purview of the dictionary, but rather of the
>> application that is using the dictionary. The particular CIF1 tag
>> to put in the datafile is also not an issue, as that was given at
>> the beginning. So the tag_style proposal is not relevant here.
>>
>> Scenario 2: given a CIF2 datafile, a DDLm application wishes to
>> produce an equivalent CIF1 datafile. For many of the CIF2 datanames
>> found in the CIF2 datafile, there are multiple possible datanames
>> listed as aliases. How is the application to ensure that it writes a
>> set of datanames from DDL1 dictionaries only or DDL2 dictionaries
>> only? The simple solution alluded to by John B would be to do as
>> follows: for each dictionary URI mentioned in the alias list, use the
>> IUCr CIF dictionary register (and/or other canonical sources) to
>> determine the DDL version of that dictionary. DDL conformance is a
>> standard entry in the dictionary register. The latest dictionary
>> version as given in the dictionary register could be selected where
>> multiple versions are presented (URL for the register is
>> ftp://ftp.iucr.org/pub/cifdics/cifdic.register).
>>
>> Of course, any program wanting to do such conversions efficiently
>> would pregenerate a DDL version - dictionary table once and refer to
>> that. I therefore see no use, either in terms of efficiency or new
>> functionality, for the tag_style attribute.
>>
>> Please advise if I have misunderstood the problem.
>>
>> James.
>> On Thu, Jan 20, 2011 at 11:20 AM, Herbert J. Bernstein
>> <yaya@bernstein-plus-sons.com> wrote:
>>> No, a tag style is simply supposed to identify a grouping of alias
>>> tag choices that belong together, so you can decide to put out
>>> those particular versions of tags. It is just a text string,
>>> just like an alternate conformer identifier.
>>>
>>> The same tag name could be marked with as many tag styles as
>>> you choose. It is just text.
>>> But you could not give multiple aliases for the same DDLm tag for
>>> the same tag style when allowing DDLm missing value generation or
>>> you would not know which version to put out, and for validation,
>>> there is no reason not to use different styles for the different
>>> alternatives.
>>>
>>> The way I will write the extraction algorithm, if you choose
>>> a tag style, you will get the DDLm name for the tags that don't
>>> have an alias for the chosen style, but the tag alias given for the
>>> specified style if there is one. That way, for a dictionary that is
>>> intended to support DDL1, DDL2 and DDLm and for which the DDLm
>>> tags happen to be primarily consistent with DDL2 conventions,
>>> the tags that conform to DDL2 conventions will
>>> not need a DDL2 style alias, just a DDL1 style alias. You will
>>> only need both a DDL1 style alias and a DDL2 style alias for
>>> a tag for which the DDLm tag is different from both, e.g.
>>> for _diffrn_standards_decay_% (DDL1), _diffrn_standards.decay_%
>>> (DDL2) and _diffrn_standards_decay_percent (DDLm). When you
>>> want DDLm output and validation, you don't specify a style at all.
>>>
>>> This will be very nice to allow an automatic cleanup for dictionaries
>>> using a prefix, say pdbx, for tags that later get promoted
>>> to not need a prefix.
>>>
>>> Regards,
>>> Herbert
>>>
>>> =====================================================
>>> Herbert J. Bernstein, Professor of Computer Science
>>> Dowling College, Kramer Science Center, KSC 121
>>> Idle Hour Blvd, Oakdale, NY, 11769
>>>
>>> +1-631-244-3035
>>> yaya@dowling.edu
>>> =====================================================
>>>
>>> On Wed, 19 Jan 2011, Bollinger, John C wrote:
>>>
>>>> On Wednesday, January 19, 2011 3:47 PM, Herbert J. Bernstein wrote:
>>>>> The definition_id most certainly does not exhibit the tag
>>>>> style. For example, there is no way to distinguish DDLm
>>>>> tag style from DDL2 or DDL2 tag style from context.
>>>>> That is intentionally inherent in the design of DDLm.
>>>>
>>>> Then I'm afraid I don't quite comprehend the meaning of "tag style". I
>>>> would like to, so that I can form a well-founded opinion about it.
>>>>
>>>> As I thought I had understood the idea, the tag style is proposed to
>>>> identify the set of DDL conventions with which the given alias complies.
>>>> If that were indeed what it was intended to mean, however, then (1) as
>>>> you observe, some names would comply with more than one set of
>>>> conventions, but also (2) a set of candidate tag styles, at least, could
>>>> be computed for any alias name.
>>>>
>>>> What would be the significance of marking an alias that conforms with
>>>> both DDL2 and DDLm conventions with tag style DDL2?
>>>>
>>>> Might it ever be needful or useful to mark the same alias with more than
>>>> one tag style?
>>>>
>>>>> As for defining a hypothetical URI, that can break,
>>>>> or at least time out programs trying to get additional
>>>>> information about an aliased tag from that URI. URIs
>>>>> should be for things that really exist on the web,
>>>>> not a substitute for a tag that really defines something
>>>>> different, in this case the style of tags.
>>>>
>>>> I don't think the issue is nearly so clear cut. I would hold, for
>>>> example, that the primary purpose of a URI is to *identify* a resource.
>>>> That's what the "I" stands for, as I'm sure you're aware. RFC 3986
>>>> (Uniform Resource Identifier (URI): General Syntax) explicitly provides
>>>> that a URI may identify an abstract resource. RFC 2396 (now obsoleted by
>>>> 3986) says the same. Although many URIs fulfill their purpose by serving
>>>> as resolvable web addresses, some, even among those formatted as URLs, do
>>>> not. Examples of the latter abound in various XML communities.
>>>>
>>>> Personally, however, I think a bit more like you do: a URL ought to refer
>>>> to a retrievable resource on the web.
>>>> For an abstract or virtual resource, therefore, I prefer to use a
>>>> URN. For something like your virtual DDL1 imgCIF dictionary, I
>>>> might choose something like urn:x-imgCIF:DDL1. If a URN were used,
>>>> then programs assuming a resolvable URL might still break, but only
>>>> if they were poorly crafted indeed would they hang pending a time
>>>> out. The whole issue could largely be mooted by clarifying the
>>>> purpose and intended usage of _alias.dictionary_uri in its
>>>> definition. That need not prevent programs from attempting to
>>>> resolve dictionary URIs, but if it specified that dictionary URIs
>>>> might be permanently unresolvable then programmers would know to
>>>> prepare for that possibility.
>>>>
>>>>> We already do something very similar to this with
>>>>> alternate conformers and with NMR model numbers. It
>>>>> really is a simple concept for organizing information
>>>>> that belongs in groups, in this case the group of
>>>>> DDL1 or DDL2 or DDLm or ... style tags.
>>>>
>>>> I think that makes it a bit clearer to me what you want to do, but I'm
>>>> still interested in the answers to my questions above. I'm a bit
>>>> uncomfortable with defining generic groups of aliases with per-dictionary
>>>> semantics, if that's indeed what you're proposing. For one thing, it
>>>> does not play well with dictionary merging. For another, the meaning of
>>>> the groupings is nowhere defined, at least not without adding at least
>>>> one more data name to DDLm for that purpose.
>>>>
>>>> On the other hand, data names have at least one natural grouping: the
>>>> dictionaries in which they are defined. This grouping is already modeled
>>>> in DDLm, and as far as I can tell, it is conceptually a perfect fit for
>>>> what you want to do.
>>>>
>>>> That doesn't necessarily mean that there is no use for a more general
>>>> grouping mechanism.
>>>> I am curious indeed whether there are use cases for
>>>> grouping data names that do not align well with dictionaries or
>>>> dictionary-defined attributes. Can anyone suggest some?
>>>>
>>>>> It solves
>>>>> a very real problem for me with imgCIF. It does
>>>>> no harm to anybody else. If nobody uses it in
>>>>> another dictionary, it still would have been a useful
>>>>> addition to DDLm.
>>>>
>>>> I very much want you to have a solution to your problem, and I have
>>>> suggested one that still seems absolutely natural to me. It may be that
>>>> there are better alternatives, and perhaps even that tag style would be
>>>> one such. Of the latter, however, I am not yet persuaded.
>>>>
>>>> Perhaps "harm" is too charged a word, but adding an additional attribute
>>>> to DDLm certainly does cost everyone else. Every DDLm application must
>>>> support all the DDLm attributes, so every additional attribute places a
>>>> development and maintenance burden on multiple developers. That
>>>> incrementally slows software release cycles and introduces additional
>>>> space for bugs and incompatibilities to hide. It's a small cost for most
>>>> people, but everyone pays it. The proposed tag style is no different in
>>>> that regard from any other DDLm attribute, of course, but that doesn't
>>>> mean that its cost should be ignored.
>>>>
>>>> As for whether it would be a useful addition to DDLm, that is exactly
>>>> what I am trying to decide. Potential use cases such as I solicited
>>>> above would help me make that decision.
>>>>
>>>>> In the end, I suspect that both core and mmCIF DDLm
>>>>> dictionaries will be built this way, because it
>>>>> makes it simpler and clearer and allows multi-purpose
>>>>> dictionaries to be self-contained and avoid the
>>>>> maintenance headache David spotted.
>>>>
>>>> If by "multi-purpose dictionaries" you mean defining multiple virtual
>>>> dictionaries via a single DDLm dictionary, such as you plan, then I still
>>>> see the dictionary_uri as the natural way to use aliases for that
>>>> purpose. If there is a broader concept here then please help me see it.
>>>>
>>>>
>>>> Regards,
>>>>
>>>> John
>>>>
>>>> --
>>>> John C. Bollinger, Ph.D.
>>>> Department of Structural Biology
>>>> St. Jude Children's Research Hospital
>>>>
>>>>
>>>> Email Disclaimer: www.stjude.org/emaildisclaimer
>>>>
>>>> _______________________________________________
>>>> ddlm-group mailing list
>>>> ddlm-group@iucr.org
>>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>>
>>
>> --
>> T +61 (02) 9717 9907
>> F +61 (02) 9717 3145
>> M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
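
For readers following the thread, the extraction rule described in the quoted message above (use the alias registered for the chosen style if there is one, otherwise fall back to the DDLm name, and use the DDLm name when no style is given) can be sketched roughly as follows. This is only an illustration under stated assumptions, not the algorithm as actually implemented: the table layout, the function names build_alias_table and output_tag, and the tag _example.item are hypothetical. Only the three _diffrn_standards names come from the thread.

```python
from typing import Dict, Iterable, Optional, Tuple

# Hypothetical in-memory layout: DDLm tag -> {tag style -> alias tag}.
AliasTable = Dict[str, Dict[str, str]]


def build_alias_table(rows: Iterable[Tuple[str, str, str]]) -> AliasTable:
    """Collect (ddlm_tag, style, alias) rows into a lookup table.

    A second, different alias for the same DDLm tag under the same
    style is rejected, because the output name would be ambiguous.
    """
    table: AliasTable = {}
    for ddlm_tag, style, alias in rows:
        by_style = table.setdefault(ddlm_tag, {})
        if style in by_style and by_style[style] != alias:
            raise ValueError(f"{ddlm_tag}: two aliases for style {style!r}")
        by_style[style] = alias
    return table


def output_tag(table: AliasTable, ddlm_tag: str,
               style: Optional[str] = None) -> str:
    """Return the alias for the chosen style, else the DDLm name itself."""
    if style is None:
        # No style requested: plain DDLm output.
        return ddlm_tag
    return table.get(ddlm_tag, {}).get(style, ddlm_tag)


ROWS = [
    # Example from the thread: the DDLm name differs from both aliases.
    ("_diffrn_standards_decay_percent", "DDL1", "_diffrn_standards_decay_%"),
    ("_diffrn_standards_decay_percent", "DDL2", "_diffrn_standards.decay_%"),
    # Hypothetical tag whose DDLm name already follows DDL2 conventions,
    # so only a DDL1 alias is registered.
    ("_example.item", "DDL1", "_example_item"),
]
TABLE = build_alias_table(ROWS)

for chosen in ("DDL1", "DDL2", None):
    print(chosen, [output_tag(TABLE, tag, chosen) for tag in TABLE])
```

With this layout, a tag whose DDLm name already matches DDL2 conventions needs no DDL2 entry at all, which is the saving described in the quoted message. The uniqueness check is applied unconditionally here for simplicity; the thread only requires it when missing value generation is allowed.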
- Follow-Ups:
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. .. . (John Westbrook)
- References:
- Re: [ddlm-group] Objectives of CIF2 syntax discussion (James Hester)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion (Herbert J. Bernstein)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. . (Bollinger, John C)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. . (Herbert J. Bernstein)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. . (David Brown)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. . (Herbert J. Bernstein)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. . (Bollinger, John C)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. . (Herbert J. Bernstein)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. .. . (Bollinger, John C)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. .. . (Herbert J. Bernstein)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. .. . (James Hester)
- Re: [ddlm-group] Objectives of CIF2 syntax discussion. .. .. . (Herbert J. Bernstein)