Re: [ddlm-group] Handling single string values longer than maximum line length
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Handling single string values longer than maximum line length
- From: Joe Krahn <krahn@niehs.nih.gov>
- Date: Wed, 25 Nov 2009 16:08:03 -0500
- In-Reply-To: <20091125151844.GA14826@emerald.iucr.org>
- References: <20091125151844.GA14826@emerald.iucr.org>
If we allow for line-folding to occur after parsing, and backslash (aka
reverse solidus) is not used as an elide/escape character, then no
special rules are needed to put line-folded strings within triple-quoted
strings.

Semicolon-delimited strings are a special case, because there folding
also serves to keep an embedded semicolon from being read as beginning a
new line, and hence from being accepted as the close-quote. However, if
the lexer does not unfold long strings, this actually is an elide
mechanism. The consensus seems to be heading towards no elide mechanism,
so maybe breaking long lines just before a semicolon should simply be
prohibited.

ALSO, SOMETHING I JUST REALIZED: Line folding makes it possible to break
an embedded triple-quote into two parts, so it actually provides a way to
elide triple quotes indirectly within a large triple-quoted string.
Therefore, as long as CIF2 keeps line-folding, it is trivial to put
CIF-within-CIF, or any other unrestricted string, without any of the
reverse-solidus elide rules.

Joe

Brian McMahon wrote:
>> (I've switched the thread title to deal separately with line folding.)
>
> Well, I didn't because I was distracted when about to hit the
> 'Send' button! So this is just a repeat of the previous posting but
> under a new thread in case we wish to take up this general discussion
> later.
>
> Regards
> Brian
>
> As Herbert says, line folding is part of the CIF 1.1 spec (pages 34-35
> of the ITG bible). Currently, it invokes a special meaning for the
> backslash (reverse solidus) character, but only when it is the first
> non-blank after an opening semicolon or comment hash delimiter. We have
> yet to discuss whether to extend it to other string types (specifically
> the triple-quoted strings).
>
> It's quite easy these days to generate single strings that are longer
> than 2048 characters (or any other arbitrary line limit) - e.g. a
> protein or nucleic acid sequence. Many, many chemical names broke the old
> 80-character line length limit.
>
> We're very happy with CIF applications that do not interpret the
> line-folding protocol, so long as they preserve the existing backslashes.
> However, a fully-compliant CIF 1.1 parser should be able to return an
> unfolded string to an application that requests it.
>
> As Herbert says, if this were dropped as part of the CIF2 specification,
> we would need to think carefully about how else to retain this
> functionality.
>
> Regards
> Brian
>
> On Wed, Nov 25, 2009 at 07:54:51AM -0500, Herbert J. Bernstein wrote:
>> The line folding protocol was discussed and adopted by COMCIFS and is
>> posted, along with other "Common Semantic Features" at
>>
>> http://www.iucr.org/resources/cif/spec/version1.1/semantics
>>
>> but that is neither here nor there. The point is that the IUCr uses CIF
>> to get work done. If we disable something they are using, we should offer
>> some equivalent functionality so they can use CIF 2 to do their work.
>> Otherwise, they will have to do the sensible thing, and continue to use
>> CIF 1, or, worse, create their own dialect of CIF 2.
>>
>> Now, I broke my nose yesterday morning and find myself a bit punchy today,
>> so I will drop out of this discussion for a while. Hopefully, when I
>> return to it, this whole matter will be settled in some way that will
>> allow people to actually use CIF 2, instead of it becoming what it seems
>> on its way to becoming -- something elegant but not terribly useful, a bit
>> like PL/I.
>>
>> Cheers,
>> Herbert
>>
>> =====================================================
>> Herbert J. Bernstein, Professor of Computer Science
>> Dowling College, Kramer Science Center, KSC 121
>> Idle Hour Blvd, Oakdale, NY, 11769
>>
>> +1-631-244-3035
>> yaya@dowling.edu
>> =====================================================
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
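
The folding behaviour discussed in this thread can be sketched in a few lines of Python. This is only a simplified illustration, not the CIF 1.1 protocol in full: it assumes a single-line value that contains no backslashes of its own, marks a folded value with a lone backslash on the first line, and ends every continued line with a backslash. The names `fold` and `unfold` are hypothetical helpers, not part of any CIF library. The last lines show the point made above about triple quotes: a fold placed inside an embedded `"""` means the folded text never contains three consecutive quote characters, yet unfolding after parsing returns the original value.

```python
def fold(text, width=80):
    """Fold a single-line string so that no physical line exceeds `width`.

    Simplified sketch: the first line is a lone backslash (the fold marker)
    and every continued line ends with a backslash.
    """
    if "\n" in text or "\\" in text:
        raise ValueError("this sketch only handles single-line, backslash-free text")
    if width < 2:
        raise ValueError("width must leave room for the trailing backslash")
    chunk = width - 1  # reserve one column for the trailing backslash
    pieces = [text[i:i + chunk] for i in range(0, len(text), chunk)] or [""]
    return "\\\n" + "\\\n".join(pieces)


def unfold(folded):
    """Reverse fold(): drop the marker line and rejoin continued lines."""
    lines = folded.split("\n")
    if lines[0].rstrip() != "\\":
        return folded  # no fold marker, so return the value unchanged
    out = []
    for line in lines[1:]:
        stripped = line.rstrip()  # trailing blanks after the backslash are ignored
        if stripped.endswith("\\"):
            out.append(stripped[:-1])  # continuation: join with the next line
        else:
            out.append(line + "\n")  # a genuine line end
    result = "".join(out)
    return result[:-1] if result.endswith("\n") else result


# A long sequence-like value round-trips through fold/unfold.
sequence = "ACGT" * 30
assert unfold(fold(sequence, width=40)) == sequence

# A fold that happens to fall inside an embedded triple quote hides it
# from the lexer, but the unfolded value is unchanged.
value = 'a """ b'
folded = fold(value, width=5)
assert '"""' not in folded
assert unfold(folded) == value
```

Note that `fold` breaks at fixed offsets, so the triple quote is split here only because `width=5` was chosen to make that happen; a writer relying on this trick would have to choose fold positions deliberately.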