Re: [ddlm-group] Simon's elide proposal
- To: Group finalising DDLm and associated dictionaries <[email protected]>
- Subject: Re: [ddlm-group] Simon's elide proposal
- From: James Hester <[email protected]>
- Date: Wed, 12 Jan 2011 23:55:35 +1100
- In-Reply-To: <[email protected]>
- References: <[email protected]><[email protected]>
In triple-quoted strings there is no need to create \" or \' elides.
It is sufficient to simply break up any embedded triple quotes. This
is the insight behind proposals C and D and Simon's proposal. Simon's
proposal is therefore complete. If you don't believe me, please
present me with a string that you think is not handled by this
proposal and I'll undertake to present you with the elided version.

On Wed, Jan 12, 2011 at 9:39 PM, Herbert J. Bernstein
<[email protected]> wrote:
> Actually, Simon's proposal, while useful, is not complete,
> inasmuch as \" and \' are not handled yet. I urge adoption
> of my compromise suggestion as written. Without it,
> we are going down the same slippery slope we crashed on
> the last time we tried to resolve this issue. -- Herbert
>
> =====================================================
>  Herbert J. Bernstein, Professor of Computer Science
>    Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
>
>                  +1-631-244-3035
>                  [email protected]
> =====================================================
>
> On Wed, 12 Jan 2011, James Hester wrote:
>
>> Note that Simon's proposal *does* completely answer Ralf's concern
>> about the lack of elide mechanism in triple quoted strings. It
>> provides line folding as well. I for one would consider our job
>> finished if we were to adopt Simon's proposal, and see no need for the
>> further steps proposed by Herbert. Herbert is of course welcome to
>> propose including the various Python behaviours as a separate
>> amendment to the CIF2 standard.
>>
>> I would propose a slight tweak to Simon's proposal, so that it works as
>> follows:
>>
>> The datavalue is obtained from the triple-quoted string in two steps:
>> (1) All instances of <backslash><eol> are removed from the string
>> where the <backslash> is not preceded by another <backslash>
>> (2) All other instances of <backslash><eol> are replaced with <eol>
>>
>> This means that a sequence of n backslashes followed by newline is
>> replaced by a sequence of n-1 backslashes followed by newline, except
>> if there is one backslash before the newline, in which case both
>> newline and backslash are removed. Triple quote sequences are elided
>> by inserting a <backslash><eol> sequence between <delimiter>
>> characters to break up the triple delimiter sequence. Note also that
>> backslash has no special meaning if not in a sequence finishing with
>> <eol>.
>>
>> I will be posting a separate email, hopefully tonight, where I will
>> list the current elide proposals and request that we all indicate
>> which ones are potentially acceptable to us, with a ranking if
>> possible. This may help us to restrict discussion to something that
>> is mutually acceptable.
>>
>> On Sun, Jan 9, 2011 at 2:45 AM, Herbert J. Bernstein
>> <[email protected]> wrote:
>>>
>>> Here is a possible compromise. This thread began with
>>> Ralf's concern about the lack of an elide mechanism
>>> in treble quoted strings. Simon's suggestion does
>>> not really answer that question, but it is a reasonable
>>> step in that direction. So, how about ...
>>>
>>> 1. Immediately adopt Simon's suggestion to allow the
>>> \\n and \\ elides in treble quoted strings. Except for
>>> the confusion in the meaning of \"""" if a more general
>>> elide is eventually adopted that should cause very little
>>> stress for anybody.
>>>
>>> 2. Add Ralf's proposed new section 7 to the CIF2
>>> document as a proposal under discussion, with the
>>> advice that people may wish to avoid creating treble-quoted
>>> strings that conflict with the full python elide
>>> conventions.
>>>
>>> 3. Provide a coherent discussion document for COMCIFS
>>> and the community at large on the alternatives in
>>> handling the treble-quoted string, asking for comments
>>> to the list prior to the Madrid meeting. I would suggest
>>> that Ralf be asked to contribute a page or 2 on the
>>> merits of his proposal and that either John B. or James
>>> contribute a page or 2 on their objections and alternatives.
>>>
>>> 4. Discuss it face to face at the Madrid meeting and
>>> try to come to a resolution.
>>>
>>> 5. Move forward with the rest of CIF2 as proposed in
>>> the meantime so we will be ready to discuss all of CIF2
>>> at the Madrid meeting, with an effort to have sample parsers
>>> and data sets available on the web prior to the meeting.
>>>
>>> Regards,
>>>  Herbert
>>>
>>> On Sat, 8 Jan 2011, Herbert J. Bernstein wrote:
>>>
>>>> Dear James,
>>>>
>>>> You are clearly a much better programmer than I am. When I got down
>>>> into the interactions among the treble quote, single quotes, text
>>>> fields, elides, the bracketed constructs and comments in the lexical
>>>> scan, I found the going tough. If you have it done neatly, I would
>>>> greatly appreciate seeing it.
>>>>
>>>> I think we need a face to face meeting or Skype meeting to resolve not
>>>> just this one issue, but the process of getting a workable CIF2.
>>>> Perhaps we can finally get to do that in Madrid.
>>>>
>>>> Regards,
>>>>  Herbert
>>>>
>>>> On Sat, 8 Jan 2011, James Hester wrote:
>>>>
>>>>> I can't let these assertions go unchallenged:
>>>>>
>>>>> On Sat, Jan 8, 2011 at 12:04 PM, Herbert J. Bernstein
>>>>> <[email protected]> wrote:
>>>>>>
>>>>>> Dear Simon,
>>>>>>
>>>>>>   Adoption of Ralf's proposal will ...
>>>>>>
>>>>>>   1. Make it much easier to create a CIF2 parser, because for one of
>>>>>> the messiest parts of the code we will have a clear specification,
>>>>>> sample code and a way to validate the tough cases.
>>>>>
>>>>> If we adopt a simpler spec than the Python in toto spec:
>>>>> - there will be many fewer tough cases
>>>>> - there will be a simpler and therefore clearer specification
>>>>> - for many alternative schemes the lexer will be unchanged from the
>>>>>   current version, with the elide behaviour simply requiring a
>>>>>   search and replace following lexing
>>>>> Triple-quoted string handling is not currently a messy part of the
>>>>> code; I don't understand why you think this. It will become
>>>>> significantly more complex under Ralf's proposal.
>>>>>
>>>>>>   2. Make it easier for users to conform to the quoting rules,
>>>>>> because at least that one part of CIF2 will be thoroughly documented
>>>>>> with lots of examples.
>>>>>
>>>>> Quoting rules are not rocket science. About 3 examples will be
>>>>> enough, if we adopt a simple specification rather than the
>>>>> unicode+raw+lots of escapes that the Python proposal entails.
>>>>> Doing things the Python way would imply more chance for user
>>>>> misunderstanding, especially bearing in mind that CIF2 users are not
>>>>> necessarily Python programmers or even programmers at all. For these
>>>>> users, there is absolutely no benefit in adopting Python or any other
>>>>> language's approach - they are unfamiliar with them all.
>>>>>
>>>>>>   3. Make it easier for the journals and archives to deal with "odd"
>>>>>> CIF2 files containing complex treble quoted strings because at
>>>>>> least that one part of CIF2 will be thoroughly documented with lots
>>>>>> of examples, and, with a utility (IDLE) all ready to allow them
>>>>>> to pull out a troublesome treble-quoted string and figure out what
>>>>>> it means or what it might mean if some intuitive change were made.
>>>>>
>>>>> The simpler the spec, the less likely mistakes will be made and the
>>>>> less chance of ambiguity.
>>>>>
>>>>>>   Yes, if Ralf's proposal happens to be rejected, we will still have
>>>>>> a problem in the lack of elide handling, and yes we will have to
>>>>>> put in the time and effort to consider those alternatives, but, first,
>>>>>> in order to have some chance of finishing the specification of CIF2
>>>>>> before the summer meeting deadlines (at least one of which is in
>>>>>> just a little more than 3 weeks), might it not be a good idea
>>>>>> to discuss and consider what was actually proposed instead of
>>>>>> chasing after lots of plausible alternatives that we already discussed
>>>>>> and rejected, and so are not very likely to agree upon rapidly now?
>>>>>
>>>>> I have some hope that, by restricting our discussion to treble-quoted
>>>>> strings, we can make progress compared to previous attempts. I have
>>>>> considered and discussed at length Ralf's proposal, and would be
>>>>> interested in your responses to my particular objections.
>>>>>
>>>>>>   So, before I delve into the many subtle variations of elide
>>>>>> mechanisms, I would appreciate our finishing consideration of Ralf's
>>>>>> actual proposal:
>>>>>>
>>>>>> =======================
>>>>>>
>>>>>> His revised wording (with one correction) is:
>>>>>>
>>>>>> ========================
>>>>>>
>>>>>> CHANGE 7 NEW
>>>>>>
>>>>>>
>>>>>> Triple-quote delimited strings.
>>>>>>
>>>>>> The following ASCII sequences delimit the beginning of a string:
>>>>>>
>>>>>>     """
>>>>>>     '''
>>>>>>     r"""
>>>>>>     r'''
>>>>>>     u"""
>>>>>>     u'''
>>>>>>
>>>>>> The characters following the delimiter sequence are interpreted
>>>>>> with exactly the same algorithm as implemented for triple-quoted
>>>>>> strings in the Python programming language version 2 series.
>>>>>> In this algorithm, triple-quoted strings are terminated by matching
>>>>>> """ or ''' delimiters.
>>>>>>
>>>>>> For example
>>>>>>
>>>>>>     """He said "His name is O'Hearly"."""
>>>>>>     r'''In {\bf \TeX} the accents are \' and \".'''
>>>>>>
>>>>>> Triple-quoted strings provide a reliable mechanism for storing any
>>>>>> arbitrary string in a CIF2 file.
>>>>>>
>>>>>> =========================
>>>>>>
>>>>>> This is cleaner and simpler than the original change 7 wording.
>>>>>> It probably does not conflict with existing CIF1 documents and the
>>>>>> _only_ CIF2 documents it can conflict with are the very few
>>>>>> that happen to end in \""" or \''''. The new leading delimiters
>>>>>> r""", r''', u""" and u''' will have to be added to the list of
>>>>>> forbidden starts to white-space delimited data values in change 5.
>>>>>> In exchange for these minor adjustments to valid CIF2 syntax we gain
>>>>>> a fully documented, software supported way to include arbitrary
>>>>>> strings in a CIF2 document that people are already used to working
>>>>>> with.
>>>>>>
>>>>>> I have reviewed the discussion of the "use of elides in strings"
>>>>>> thread in the ddlm-group discussion list, and, while I did not
>>>>>> then and do not now understand the objections to the general use
>>>>>> of elides in quoted strings, I particularly do not understand
>>>>>> the logic of objecting to the use of elides in treble-quoted strings,
>>>>>> which are a construct completely new to CIF and therefore in
>>>>>> conflict with no existing data files.
>>>>>>
>>>>>> Would those who have an objection to Ralf's proposal please
>>>>>> state their objections. An objection that says we object because
>>>>>> in past discussions another body could not manage to come to an
>>>>>> agreement and just gave up does not speak to the merits of this
>>>>>> specific proposal.
>>>>>>
>>>>>> I have no idea why we are considering other proposals before
>>>>>> settling the status of Ralf's proposal.
>>>>>
>>>>> It is also useful to know what the alternatives might be when
>>>>> considering a proposal.
>>>>>
>>>>>> I agree with Ralf's proposal.
>>>>>>
>>>>>> Regards,
>>>>>>   Herbert
>>>>>>
>>>>>> At 12:37 AM +0000 1/8/11, SIMON WESTRIP wrote:
>>>>>>>
>>>>>>> Dear Herbert
>>>>>>>
>>>>>>> I fail to see how the adoption of python string quoting rules is
>>>>>>> going to make life easier for anyone other than a python programmer?
>>>>>>> Even then, the mechanism is restricted to treble-quoted strings,
>>>>>>> which are only one part of CIF. Maybe I've missed something, but
>>>>>>> just because CIF might share common syntax with a programming
>>>>>>> language in one respect, does not necessarily mean that the tools
>>>>>>> of that medium are available to CIF?
>>>>>>>
>>>>>>> If you're looking to base CIF extensions on established mechanisms,
>>>>>>> why not adopt the minimal \(newline) and \\ escape sequences, which
>>>>>>> in essence are the same as the established CIF line-folding
>>>>>>> protocol (just dropping the initial \ following the opening
>>>>>>> delimiter and formalising the protocol as an inherent part of the
>>>>>>> spec). After all, I believe you have already been using it, or at
>>>>>>> least interpreted it, as a means to escape 'semicolon delimiters'
>>>>>>> within semicolon-delimited values (I seem to recall discussions
>>>>>>> that identified an issue with the published 'trip tests' relating
>>>>>>> to line folding).
>>>>>>>
>>>>>>> Forgive me if I have missed something regarding the usefulness of
>>>>>>> python in CIF; please enlighten me as to its benefits if I were to
>>>>>>> write a CIF reader using anything but python. As far as I can see,
>>>>>>> the only advantages lie in the fact that the logic is established
>>>>>>> and thus unquestionable; but that does not mean it is necessarily
>>>>>>> entirely appropriate for CIF (which after all isn't a programming
>>>>>>> language).
>>>>>>>
>>>>>>> Cheers
>>>>>>>
>>>>>>> Simon
>>>>>>>
>>>>>>> From: Herbert J. Bernstein <[email protected]>
>>>>>>> To: Group finalising DDLm and associated dictionaries
>>>>>>> <[email protected]>
>>>>>>> Sent: Friday, 7 January, 2011 23:07:40
>>>>>>> Subject: Re: [ddlm-group] Eliding in triple-quoted strings:
>>>>>>> Proposals C and D. .. .. .
>>>>>>>
>>>>>>> Dear Colleagues,
>>>>>>>
>>>>>>> Ralf's proposal is what it is. Before we go haring off in other
>>>>>>> directions, we should respond constructively to what he has proposed.
>>>>>>> I support it. Ralf and John W. support it. John B. and James H.
>>>>>>> oppose it. I think they are mistaken because ...
>>>>>>>
>>>>>>> It is well and good to adopt a "Real Programmers Don't Eat
>>>>>>> Quiche" let's-start-from-scratch-and-roll-our-own approach when
>>>>>>> you have the resources to accomplish our goals that way. It
>>>>>>> is a lot of fun, and has the potential to truly advance the
>>>>>>> field, but it is also, in the current funding climate, unrealistic.
>>>>>>>
>>>>>>> In the U.S., there is a serious prospect of science funding being
>>>>>>> cut back so severely that the hit rates on grants next year may
>>>>>>> be as low as 1 in 10. I suspect an honest review of funding
>>>>>>> prospects in other countries will uncover similarly dire warnings.
>>>>>>>
>>>>>>> This does not mean we are all going out of business, but we do have
>>>>>>> to be careful to conserve resources and focus our do-it-from-scratch
>>>>>>> efforts on those areas that have the highest priority, and I fear,
>>>>>>> for most of our community, CIF2, while important, is not likely to
>>>>>>> be seen as worth that approach, and certainly filing the edges of
>>>>>>> a brand-new treble quote spec is likely to be very far down
>>>>>>> on anybody's priority list.
>>>>>>>
>>>>>>> Ralf has made a proposal that will save all of us a lot of effort
>>>>>>> and allow us to devote more resources to higher priority problems.
>>>>>>>
>>>>>>> Not only is he right on this one point, but I urge us to look for
>>>>>>> other areas where we can get to CIF2 by building on work that is
>>>>>>> already done.
>>>>>>>
>>>>>>> This is not a good time for wheel-reinvention.
>>>>>>>
>>>>>>> I would appreciate knowing from those who wish to reinvent this
>>>>>>> particular wheel, why they wish to do that and from where they
>>>>>>> expect to get the resources to do it?
>>>>>>>
>>>>>>> Regards,
>>>>>>>   Herbert
>>>>>>>
>>>>>>> On Fri, 7 Jan 2011, Bollinger, John C wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> On Friday, January 07, 2011 3:14 PM, Herbert J. Bernstein wrote:
>>>>>>>>
>>>>>>>>> We seem not to be communicating effectively.
>>>>>>>>>
>>>>>>>>> What I am asking for is an _existing_, supported treble quote
>>>>>>>>> specification from an _existing_ language with _existing_
>>>>>>>>> documentation and _existing_ software as an alternative to the
>>>>>>>>> Python specification, documentation and software to which we all
>>>>>>>>> have access, that is being proposed as an alternative to what
>>>>>>>>> Ralf has proposed.
>>>>>>>>
>>>>>>>> Thank you for that clarification. You are right, I didn't
>>>>>>>> understand what you were asking for.
>>>>>>>>
>>>>>>>> I hope this will likewise clarify my position: I reject the premise
>>>>>>>> that the system we choose must meet those criteria, and I oppose
>>>>>>>> adopting the full Python syntax and semantics.
>>>>>>>>
>>>>>>>>> The Python specification is available at
>>>>>>>>>
>>>>>>>>> http://docs.python.org/reference/index.html
>>>>>>>>>
>>>>>>>>> with the lexical analysis at
>>>>>>>>>
>>>>>>>>> http://docs.python.org/reference/lexical_analysis.html
>>>>>>>>
>>>>>>>> Thanks, though that is exactly what I was looking at already. It
>>>>>>>> leaves several details unclear, some of which I discussed in
>>>>>>>> previous messages.
>>>>>>>> Hence, I consider it slightly short of a *full* specification. It
>>>>>>>> does, however, provide my grounds for opposing adoption of that
>>>>>>>> scheme for CIF.
>>>>>>>>
>>>>>>>>> The complete source code and binaries are available at:
>>>>>>>>
>>>>>>>> Unless you propose to append a particular set of Python sources to
>>>>>>>> the CIF specification as a reference, I have no interest in
>>>>>>>> perusing the source code to seek answers to such questions of
>>>>>>>> detail as I have. Furthermore, I would oppose adding such an
>>>>>>>> appendix on the grounds that it would be exceedingly difficult to
>>>>>>>> use to resolve questions such as mine.
>>>>>>>>
>>>>>>>> I am likewise unwilling to rely on the behavior of the python
>>>>>>>> binary that happens to be installed on my computer to answer them.
>>>>>>>> If the correct behavior is not documented independent of the
>>>>>>>> program then there is no particular reason to trust that it won't
>>>>>>>> change in future versions, or that any particular implementation
>>>>>>>> is correct or bug-free.
>>>>>>>>
>>>>>>>> Regards,
>>>>>>>>
>>>>>>>> John
>>>>>>>>
>>>>>>>> --
>>>>>>>> John C. Bollinger, Ph.D.
>>>>>>>> Department of Structural Biology
>>>>>>>> St. Jude Children's Research Hospital
>>>>>>>>
>>>>>>>> Email Disclaimer: www.stjude.org/emaildisclaimer

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
[email protected]
http://scripts.iucr.org/mailman/listinfo/ddlm-group
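
The two-step rule quoted in the message above (a lone <backslash><eol> is removed; in a longer run one backslash is dropped and the <eol> is kept) and the idea of eliding by breaking up runs of delimiter characters can be illustrated with a minimal Python sketch. It is not taken from the thread: the names decode, encode and strip_delimiters are hypothetical, <eol> is assumed to be \n, \r\n or \r, and the encoder also inserts <backslash><eol> after a delimiter character that ends the value, an assumption added here so that the value cannot run into the closing delimiter.

```python
import re

# A run of one or more backslashes immediately followed by an end of line.
_RUN_BEFORE_EOL = re.compile(r"(\\+)(\r\n|\r|\n)")


def decode(body):
    """Recover the data value from the text between the triple quotes."""
    def repl(match):
        slashes, eol = match.group(1), match.group(2)
        if len(slashes) == 1:
            return ""              # step (1): a lone <backslash><eol> is removed
        return slashes[:-1] + eol  # step (2): n backslashes become n-1, eol kept
    return _RUN_BEFORE_EOL.sub(repl, body)


def encode(value, quote='"'):
    """Wrap value in triple quotes so that decode() returns it (assumed inverse)."""
    # Add one backslash to every backslash run that already precedes an eol.
    body = _RUN_BEFORE_EOL.sub(lambda m: m.group(1) + "\\" + m.group(2), value)
    # Insert <backslash><eol> after each delimiter character that is followed by
    # another one, or that ends the value, so no run of three can form.
    q = re.escape(quote)
    body = re.sub(q + "(?=" + q + r"|\Z)", lambda m: quote + "\\\n", body)
    return quote * 3 + body + quote * 3


def strip_delimiters(text, quote='"'):
    """Return the body, checking that the first closing delimiter is the last."""
    delim = quote * 3
    end = text.find(delim, 3)
    if not text.startswith(delim) or end != len(text) - 3:
        raise ValueError("embedded or missing triple-quote delimiter")
    return text[3:end]


# Round-trip check on a few values, including one with embedded triple quotes.
samples = ['He said "His name is O\'Hearly".', 'a """ b', 'ends with \\"""',
           "fold\\\nhere", "two\\\\\nslashes"]
for s in samples:
    assert decode(strip_delimiters(encode(s))) == s
```

Under these assumptions the lexer only has to find the first closing delimiter, and the elide handling is a single search-and-replace pass over the body afterwards.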
- Follow-Ups:
- Re: [ddlm-group] Simon's elide proposal (Herbert J. Bernstein)
- References:
- [ddlm-group] Simon's elide proposal (James Hester)
- Re: [ddlm-group] Simon's elide proposal (Herbert J. Bernstein)