Re: [ddlm-group] Handling single string values longer than maximumline length

To: Group finalising DDLm and associated dictionaries <[email protected]>
Subject: Re: [ddlm-group] Handling single string values longer than maximumline length
From: Joe Krahn <[email protected]>
Date: Thu, 26 Nov 2009 20:50:29 -0500
In-Reply-To: <C7343400.1262C%[email protected]>
References: <C7343400.1262C%[email protected]>

Nick Spadaccini wrote:
> I don't find the necessity for line folding a convincing argument, but so
> long as I don't have to worry about it when parsing a file, I am not fussed.
> 
> Line-folding has to exist for an 80 byte restriction, because the
> restriction is ludicrous. STAR has no restriction, CIFx has 2048 bytes
> (still silly but imposed by outside factors). One may have data values
> longer than 2048 (I have yet to see any), and sequencing data perhaps will
> fall in to this category. But if John doesn't (seemingly) understand the
> line folding issues, I am guessing the PDB doesn't employ it. If the
> custodians of macromolecular data and (presumably) sequencing data have a
> solution that does not require the convoluted line folding operations
> specified on the IUCr website, then who does?
I agree that folding just to avoid long lines is not that important. It
is mostly a work-around for line-oriented I/O, which some current Fortran
software will still need for the near future. However, some people might
want folded lines just to make CIF files easier to view in an editor.
I am interested mainly because folding can be used to elide triple-quotes.
> 
> I see Joe has already made the mistake of thinking that
> Xxxx\
> ;
> 
> Means the trailing ; is not a token delimiter. Well every other line-folding
> convention would conclude that, but the IUCr interpretation is that the
> trailing ; DOES terminate the string, and that last \ is actually stripping
> off the final \n (which isn't there anyway because that got stripped off as
> part of the lexing process -  the string terminators are supposed to be
> removed).
> 
> OR I have completely misunderstood the line folding protocol and the example
> on the IUCr webpage is wrong? I am not sure which.
I think you are right. I was confused by Herb's example

;\
;\
;

which is the same as ";" in CIF 1.1. The middle semicolon is not a
terminator because of the '\' that follows it: close-quotes are valid
only if followed by whitespace. I didn't know that applied to
semicolon-delimited strings.
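
To make that reading concrete, here is a minimal Python sketch. It only
illustrates the interpretation described above and is not the IUCr
line-folding protocol itself: the names `read_text_field` and `unfold` are
made up for this example, and it assumes the input has already been split
into lines with the line terminators removed.

```python
def read_text_field(lines):
    """Collect the content lines of a semicolon-delimited text field.

    Assumes lines[0] starts with the opening ';'. A later line closes the
    field only if it starts with ';' and that ';' is followed by
    whitespace or by the end of the line.
    """
    content = [lines[0][1:]]  # text after the opening ';'
    for line in lines[1:]:
        if line.startswith(";") and (len(line) == 1 or line[1].isspace()):
            return content
        content.append(line)
    raise ValueError("unterminated text field")


def unfold(content):
    """Join folded lines when the first content line is a lone backslash."""
    if not content or content[0].rstrip() != "\\":
        return "\n".join(content)  # not folded: keep line breaks as they are
    out = []
    for line in content[1:]:
        stripped = line.rstrip()
        if stripped.endswith("\\"):
            out.append(stripped[:-1])  # drop the backslash and the newline
        else:
            out.append(line + "\n")
    text = "".join(out)
    # The final newline belongs to the closing "\n;" delimiter, not the value.
    return text[:-1] if text.endswith("\n") else text


# Herb's three-line example from the thread.
example = [";\\", ";\\", ";"]
value = unfold(read_text_field(example))
```

With Herb's three lines as input, the middle line is kept as content and
`value` ends up as the single character `;`.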

The current rule above does not make a lot of sense. How can \ strip off 
the \n when it is really part of the "\n;" close-quote characters? Maybe 
it was done to simplify a software issue?

> 
> Either way do we all agree that the line folding is not a lexer issue?
I agree, but an implementation should be able to unfold/fold lines at
the low-level I/O layer. The important point is to make sure the syntax
is defined such that the lexer does not need to know about folding.

Joe
