
Re: [ddlm-group] options/text vs binary/end-of-line. .. .

On Monday, June 21, 2010 1:13 AM, James Hester wrote:

>I prefer the XML treatment of newline (ie translated to 0x000A for
>processing purposes).  I would be in favour of restricting newline to
><0x000A>, <0x000D> or <0x000D 0x000A>, which means that only these
>combinations have the syntactic significance of a newline.

I would be satisfied with that approach.
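
To make the normalization concrete, here is a minimal sketch of XML-style newline handling under the restriction James describes. The function name normalize_newlines is hypothetical, and this is only an illustration of the idea, not a proposed reference implementation.

```python
import re

# Assumption: only <0x000D 0x000A>, <0x000D> and <0x000A> count as a newline.
# The two-character form is listed first so CR LF collapses to one LF, not two.
_NEWLINE = re.compile(r"\r\n|\r|\n")

def normalize_newlines(text: str) -> str:
    """Translate each recognized newline sequence to a single U+000A."""
    return _NEWLINE.sub("\n", text)
```

A parser written in Python would probably want to avoid str.splitlines() for this purpose, since that method also splits on U+0085, U+2028, U+2029 and several other characters, which is broader than the three sequences above.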

>From memory, this significance is restricted to:
>
>1. end of comment
>2. whitespace
>3. use in <eol><semicolon> digraph

The significance also extends to 'single'- and "double"-quote delimited data values, in that these cannot contain an end-of-line.

>I would also restrict the appearance of the remaining Unicode newline
>characters to delimited datavalues, to maintain consistent display of
>data files.

I'm seeing more and more upside to restricting *all* non-ASCII characters to delimited data values.  I don't have any objection to restricting U+0085, U+2028, and U+2029 (did I miss any?) to such contexts.
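
As a rough illustration of how such a restriction might be checked, the sketch below reports any of those three characters found in a piece of text that is assumed to lie outside a delimited data value. The helper name find_unicode_newlines and the idea of calling it per token are my assumptions, not anything agreed on the list.

```python
# The three non-ASCII newline characters named above.
UNICODE_NEWLINES = {
    "\u0085": "NEXT LINE",
    "\u2028": "LINE SEPARATOR",
    "\u2029": "PARAGRAPH SEPARATOR",
}

def find_unicode_newlines(text: str) -> list[tuple[int, str]]:
    """Return (offset, code point label) for each restricted character in text."""
    return [
        (i, f"U+{ord(ch):04X}")
        for i, ch in enumerate(text)
        if ch in UNICODE_NEWLINES
    ]
```

A parser could call this on everything that is not inside a delimited data value and raise an error if the result is non-empty; ordinary <0x000A> and <0x000D> are deliberately not reported.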


John
--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital