Re: [ddlm-group] Alternative proposal for eliding. .

Yes, you are right.  I withdraw my proposal for an additional rule.

On Thu, Jun 9, 2011 at 1:05 AM, Bollinger, John C <John.Bollinger@stjude.org> wrote:
On Tuesday, June 07, 2011 10:01 PM, James Hester wrote:
>I agree that misreading of a legacy file without incurring a parsing error is practically impossible.
>
>We should, however, make it possible in CIF2 to present multiline values containing a backslash before the first <eol> without risking a parsing error on read when this <backslash> is misunderstood as a prefix flag.
>
>I suggest the following rule be added to the Grazulis proposal:
>(Rule one) The <eol> at the end of the first line of all <eol><semicolon> delimited values does not form part of the data value.
>
>This works as follows: when encoding a datavalue inside an <eol><semicolon> delimited string, a simple output routine would always insert an <eol> immediately after the <semicolon>, unless it wishes to use the prefix and/or line folding conventions.  On reading an <eol><semicolon> string, this first <eol> is always discarded.


Doesn't the line-folding protocol already achieve this objective?  If we formulate the Grazulis protocol as an extension or superset of the line-folding protocol (so that line-folding is always available where Grazulis is), then I don't think we need to add the proposed rule to it.  To avoid misinterpretation, a text block containing a backslash at the end of its first line would be expressed so:

_example
;\
backslash       \\

slash           /
semicolon       ;
;
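
For illustration only, here is a minimal Python sketch of how a reader might unfold that text block. It assumes a simplified reading of the line-folding protocol: folding is switched on when the first line of the block consists of a lone backslash, and thereafter a backslash that is the last non-blank character on a line is removed together with the <eol> that follows it. The names unfold and EXAMPLE_BODY are hypothetical, and "body" means the text between the opening semicolon and the closing <eol><semicolon>.

```python
def unfold(body: str) -> str:
    """Undo line folding in a text-block body (simplified sketch)."""
    lines = body.split("\n")
    # Folding is only active when the first line is a lone backslash.
    if lines[0].rstrip(" \t") != "\\":
        return body
    out = []
    pending = ""  # text carried over from folded lines
    for line in lines[1:]:
        stripped = line.rstrip(" \t")
        if stripped.endswith("\\"):
            # Drop the fold marker and join with the following line.
            pending += stripped[:-1]
        else:
            out.append(pending + line)
            pending = ""
    if pending:
        out.append(pending)
    return "\n".join(out)


# The body of the _example block above, line by line.
EXAMPLE_BODY = "\n".join([
    "\\",
    "backslash       \\\\",
    "",
    "slash           /",
    "semicolon       ;",
])

value_lines = unfold(EXAMPLE_BODY).split("\n")
print(value_lines)
```

Under these assumptions the decoded value has three lines, and the first of them ends in a single backslash: the second backslash on that line is consumed as a fold marker, and the empty line after it supplies the line break.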

Furthermore, by declining to add the new rule, we do not (further) change the meaning of existing CIFs.  Although, as David observed, the legacy CIFs that might be misinterpreted under the Grazulis protocol are surely few, the CIFs whose meaning would be changed by the proposed additional rule are very many, and the changes will often be significant.  Consider

_example
;I contain no
misspellings.
;

That value should not be parsed equivalently to

_example 'I contain nomisspellings.'
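
To make the difference concrete, here is a small Python sketch of what the proposed rule would do to the body of such a text block. It assumes the rule simply deletes the first <eol> inside the block; drop_first_eol is a hypothetical helper name, not part of any existing CIF library.

```python
def drop_first_eol(body: str) -> str:
    """Apply the proposed rule: discard the first <eol> in the body."""
    head, _sep, tail = body.partition("\n")
    return head + tail


# A legacy block whose text starts right after the semicolon.
legacy = "I contain no\nmisspellings."
print(repr(drop_first_eol(legacy)))

# A block written by an output routine that follows the proposed rule
# and inserts an <eol> immediately after the opening semicolon.
new_style = "\nI contain no\nmisspellings."
print(repr(drop_first_eol(new_style)))
```

Only the second form round-trips to the intended two-line value; the legacy form loses its line break and the two words run together.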

Consider also text blocks containing source code in languages that are sensitive to line breaks, such as Fortran or Python.


John
--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital

_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group



--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148