Re: [ddlm-group] Technical issues with Proposal P. .
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Technical issues with Proposal P. .
- From: James Hester <jamesrhester@gmail.com>
- Date: Wed, 23 Feb 2011 10:23:55 +1100
- In-Reply-To: <alpine.BSF.2.00.1102221741460.23065@epsilon.pair.com>
- References: <AANLkTi=kadbHikjabDyioDOw=L_pthGORgi6w2b45yX6@mail.gmail.com><alpine.BSF.2.00.1102220644270.84613@epsilon.pair.com><417719.45449.qm@web87006.mail.ird.yahoo.com><alpine.BSF.2.00.1102220741480.84613@epsilon.pair.com><301639.7573.qm@web87001.mail.ird.yahoo.com><alpine.BSF.2.00.1102220845481.84613@epsilon.pair.com><8F77913624F7524AACD2A92EAF3BFA54168ECD35AD@SJMEMXMBS11.stjude.sjcrh.local><AANLkTimC63ietzrW0XV7kv+Fg9MDg+G6FjtLF0PFzOKd@mail.gmail.com><alpine.BSF.2.00.1102221741460.23065@epsilon.pair.com>
Hi Herbert:

You posit an additional requirement: that the internal representation of any string may not contain artefacts of the syntactical representation ("with a meaning depending on the type given in the dictionary"). Given this additional requirement, we can confidently say that my first example string finishes with an accented e and that the second string contains an accented o followed by two double quotes.

In line with this requirement, if in fact I wanted to finish the first string with a double quote, then I am forbidden to use a double-quoted raw string. Likewise, if there are any triple double quotes internally I cannot use a double-quoted raw string. If there are both triple quotes and triple double quotes in my string, I cannot use raw strings at all for my text and either have to double up all my backslashes in 'cooked' strings, or revert to <semicolon><eol> digraphs. If my string contains <semicolon><eol> digraphs, then my only choice is to use the "cooked" strings of the Python proposal.

This additional requirement would have to be added to Proposal P, and everybody would just have to hope that CIF programmers are all sufficiently on the ball to detect any problem strings - or more likely they will simplify the code and just "cook" everything, making raw strings rather pointless.

I frankly cannot understand why anyone would think that such a fragile scheme is superior to the spare elegance of Proposals F and F' (particularly F'), but at least we have a resolution of this particular technical issue.

On Wed, Feb 23, 2011 at 9:49 AM, Herbert J. Bernstein <yaya@bernstein-plus-sons.com> wrote:
> Dear James,
>
> I don't understand the question. In its internal representation
> the string is what it is. What is the ambiguity? If a
> string is presented to an application by an API and it contains
> \", then it contains the two characters backslash and double quote
> with a meaning that depends on the type specified in the dictionary.
> There are no delimiters in the internal representation, so a double
> quote is as good or bad a character as any other. How is it a delimiter
> internally? Is there some rule that we are not supposed to have strings
> whose internal representation contains a delimiter?
>
> Regards,
> Herbert
>
> =====================================================
> Herbert J. Bernstein, Professor of Computer Science
> Dowling College, Kramer Science Center, KSC 121
> Idle Hour Blvd, Oakdale, NY, 11769
>
> +1-631-244-3035
> yaya@dowling.edu
> =====================================================
>
> On Wed, 23 Feb 2011, James Hester wrote:
>
>> I would point out that nobody has yet addressed, let alone answered my
>> question. I am *not* confused about going from syntax to internal
>> representation, as it appears Simon briefly was. I am concerned about
>> how a CIF application will disambiguate the character sequence
>> <backslash><delimiter> *in the internal representation*.
>>
>> I am however glad that we all seem to agree that the particular
>> delimiters used to express data values should not be significant
>> beyond the parser.
>>
>> On Wed, Feb 23, 2011 at 3:32 AM, Bollinger, John C
>> <John.Bollinger@stjude.org> wrote:
>>>
>>> On Tuesday, February 22, 2011 7:51 AM, Herbert J. Bernstein wrote:
>>>
>>>> From the point of view of writing a pure "CIF2" application that is
>>>> not aware of the whitespace, particular quote marks, comments, etc, those
>>>> two strings are identical.
>>>>
>>>> From the point of view of a more general CIF API, in which comments,
>>>> magic numbers, and particular quote marks, those two strings are different in
>>>> precisely the same way that the string 'ABC' and "ABC" are different, and
>>>> 13.4 and 1.34e1 are different.
>>>>
>>>> This is _not_ an ambiguity. It is a matter of whether we are looking
>>>> for the information in a file or looking for the representations of the data
>>>> in the file.
>>>
>>> Herbert is right about this. It doesn't matter which syntactic variant
>>> was used to express a data value in an input CIF. Once the value is parsed,
>>> the result is the value. In particular, under proposal P, """C\""""
>>> expresses a different value than does r"""C\"""", whereas """C\\\"""" and
>>> r"""C\"""" express the same value. The fact that the character sequence C"
>>> cannot be expressed via Python raw string format is irrelevant. An
>>> application receiving these values does not need to know and should not care
>>> in which form the value was expressed in a CIF, if indeed it was ever
>>> expressed in CIF format at all.
>>>
>>> However, although there is no technical issue here, the fact that an
>>> experienced and successful Python and CIF practitioner such as James raised
>>> the question is illuminating. It demonstrates that the complexity of the
>>> syntax and semantics provided by proposal P would be likely to be a source
>>> of confusion for developers and users both.
>>>
>>> Regards,
>>>
>>> John
>>>
>>> --
>>> John C. Bollinger, Ph.D.
>>> Department of Structural Biology
>>> St. Jude Children's Research Hospital
>>>
>>> Email Disclaimer: www.stjude.org/emaildisclaimer
>>> _______________________________________________
>>> ddlm-group mailing list
>>> ddlm-group@iucr.org
>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>
>> --
>> T +61 (02) 9717 9907
>> F +61 (02) 9717 3145
>> M +61 (04) 0249 4148

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
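
The three literals in John Bollinger's example, and the raw-string restriction James describes, can be checked directly in Python, whose string syntax Proposal P borrows. This is only a sketch of Python's own behaviour, on the assumption that Proposal P matches Python's rules in these cases; the variable names and the `parses` helper are illustrative and not part of any proposal.

```python
import ast

# "Cooked" triple-quoted literal: the backslash escapes the quote,
# so the value is the two characters C and double quote.
cooked = """C\""""

# Raw triple-quoted literal: the backslash still stops the quote from
# ending the literal, but it is kept in the value, giving three
# characters: C, backslash, double quote.
raw = r"""C\""""

# Cooked literal with the backslash itself escaped: \\ gives one
# backslash and \" gives one double quote, the same value as `raw`.
cooked_escaped = """C\\\""""

print(repr(cooked))
print(repr(raw))
print(cooked_escaped == raw)


def parses(literal_source: str) -> bool:
    """Return True if the text is a syntactically valid Python literal."""
    try:
        ast.literal_eval(literal_source)
        return True
    except SyntaxError:
        return False


# A raw triple-double-quoted literal cannot end its value with a bare
# double quote: the first three quotes close the literal and the fourth
# is left dangling.
print(parses('r"""C""""'))
```

The last check illustrates why a writer has to inspect each value before choosing a delimiter: the value C followed by a double quote has no raw triple-double-quoted form, so another delimiter or a cooked string is needed.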