Discussion List Archives


Re: [ddlm-group] Handling single string values longer than maximum line length

Hi Joe - perhaps you missed the emails from earlier this week, but Herb and I agree that his example is syntactically incorrect: the line-folding character never escapes the terminating \n; and no space is required after a \n; for it to be a valid delimiter. See ITG section 2.2.7.4.11, second-last paragraph of p. 35, for explicit confirmation of this.

The upshot is that Herb's example reduces to an empty \n; delimited string, followed by a syntactically incorrect reverse solidus-newline-semicolon, and your intuition about the way it "should" behave corresponds with how it really does behave.
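
To make that reduction concrete, here is a minimal Python sketch of the delimiter rule as described above. It is illustrative only: read_text_field is a hypothetical helper, not taken from any existing CIF parser, and it assumes the opening semicolon is already known to sit in column one. The only rule it encodes is that the first subsequent \n; closes the field, whatever character follows it.

```python
def read_text_field(source: str, start: int = 0):
    """Split a semicolon-delimited text field from the text that follows it.

    Hypothetical helper for illustration. Assumes source[start] is the
    opening ';' and that it is the first character of a line.
    Returns (raw_content, remainder).
    """
    if source[start] != ";":
        raise ValueError("text field must open with ';'")
    # The first newline-semicolon after the opener terminates the field.
    # Nothing that follows the ';' (whitespace or otherwise) is consulted.
    close = source.find("\n;", start + 1)
    if close == -1:
        raise ValueError("unterminated text field")
    raw = source[start + 1:close]
    remainder = source[close + 2:]
    return raw, remainder


# Herb's example: three lines reading ;\  ;\  ;
herb = ";\\\n;\\\n;\n"
raw, rest = read_text_field(herb)
print(repr(raw))   # content of the field: just the fold marker
print(repr(rest))  # what is left over after the closing delimiter
```

Under these assumptions the field content is a lone reverse solidus (the fold marker, which unfolds to an empty string), and the leftover text is the reverse solidus-newline-semicolon that makes the example syntactically incorrect.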

On Fri, Nov 27, 2009 at 12:50 PM, Joe Krahn <krahn@niehs.nih.gov> wrote:
Nick Spadaccini wrote:
> I don't find the necessity for line folding a convincing argument, but so
> long as I don't have to worry about it when parsing a file, I am not fussed.
>
> Line-folding has to exist for an 80 byte restriction, because the
> restriction is ludicrous. STAR has no restriction, CIFx has 2048 bytes
> (still silly but imposed by outside factors). One may have data values
> longer than 2048 (I have yet to see any), and sequencing data perhaps will
> fall into this category. But if John doesn't (seemingly) understand the
> line folding issues, I am guessing the PDB doesn't employ it. If the
> custodians of macromolecular data and (presumably) sequencing data have a
> solution that does not require the convoluted line folding operations
> specified on the IUCr website, then who does?
I agree that folding just to avoid long lines is not that important. It
is mostly a line-oriented I/O work-around, which some current Fortran
software still needs for the near future. However, some people might
want folded lines just to make it easier to view CIF files in an editor.
 I am interested mainly because folding can be used to elide triple-quotes.
>
> I see Joe has already made the mistake of thinking that
> Xxxx\
> ;
>
> Means the trailing ; is not a token delimiter. Well every other line-folding
> convention would conclude that, but the IUCr interpretation is that the
> trailing ; DOES terminate the string, and that last \ is actually stripping
> off the final \n (which isn't there anyway because that got stripped off as
> part of the lexing process -  the string terminators are supposed to be
> removed).
>
> OR I have completely misunderstood the line folding protocol and the example
> on the IUCr webpage is wrong? I am not sure which.
I think you are right. I was confused by Herb's example

;\
;\
;

which is the same as ";" in CIF 1.1. The middle semicolon is not a
terminator due to the subsequent '\', because close-quotes are valid
only if followed by whitespace. I didn't know that applied to semicolon
delimited strings.

The current rule above does not make a lot of sense. How can \ strip off
the \n when it is really part of the "\n;" close-quote characters? Maybe
it was done to simplify a software issue?

>
> Either way do we all agree that the line folding is not a lexer issue?
I agree, but an implementation should be able to unfold/fold lines at
the low-level I/O. The important point is to make sure the syntax is
defined such that the lexer does not need to know about folding.

Joe

_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group



--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
