Discussion List Archives


Re: [ddlm-group] Alternative proposal for eliding. .

On Tuesday, June 07, 2011 10:01 PM, James Hester wrote:
>I agree that misreading of a legacy file without incurring a parsing error is practically impossible.
>
>We should, however, make it possible in CIF2 to present multiline values containing a backslash before the first <eol> without risking a parsing error on read when this <backslash> is misunderstood as a prefix flag.
>
>I suggest the following rule be added to the Grazulis proposal:
>(Rule one) The <eol> at the end of the first line of all <eol><semicolon> delimited values does not form part of the data value.
>
>This works as follows: when encoding a datavalue inside an <eol><semicolon> delimited string, a simple output routine would always insert an <eol> immediately after the <semicolon>, unless it wishes to use the prefix and/or line folding conventions.  On reading an <eol><semicolon> string, this first <eol> is always discarded.


Doesn't the line-folding protocol already achieve this objective?  If we formulate the Grazulis protocol as an extension or superset of the line-folding protocol (so that line-folding is always available where Grazulis is), then I don't think we need to add the proposed rule to it.  To avoid misinterpretation, a text block containing a backslash at the end of its first line would be expressed so:

_example
;\
backslash       \\

slash           /
semicolon       ;
;
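
For concreteness, here is a rough Python sketch of how a reader might unfold such a block. It reflects my reading of the line-folding protocol, not normative text: the function name unfold_text_field is hypothetical, the input is assumed to be everything between the opening semicolon and the <eol> that precedes the closing semicolon, and <eol> is assumed to be "\n".

```python
def unfold_text_field(body: str) -> str:
    """Undo line folding in the body of a semicolon-delimited text field.

    `body` is assumed to be the text after the opening semicolon, up to
    but not including the <eol> before the closing semicolon.
    """
    lines = body.split("\n")
    # Folding applies only when the first line is a lone backslash
    # (trailing spaces and tabs are tolerated in this sketch).
    if lines[0].rstrip(" \t") != "\\":
        return body
    rest = lines[1:]
    pieces = []
    for i, line in enumerate(rest):
        stripped = line.rstrip(" \t")
        is_last = i == len(rest) - 1
        if stripped.endswith("\\"):
            # A trailing backslash is a fold marker: drop it and join
            # this line directly to the next one.
            pieces.append(stripped[:-1])
        else:
            pieces.append(line if is_last else line + "\n")
    return "".join(pieces)


# The example above: the first line of the decoded value ends in a backslash.
example = "\\\nbackslash       \\\\\n\nslash           /\nsemicolon       ;"
value = unfold_text_field(example)
```

In the example, the doubled backslash on the "backslash" line is read as one literal backslash followed by a fold marker, and the empty line that follows supplies the text that is joined on, so the line break after the literal backslash is preserved.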

Furthermore, by declining to add the new rule, we do not (further) change the meaning of existing CIFs.  Although, as David observed, the legacy CIFs that might be misinterpreted under the Grazulis protocol are surely few, the CIFs whose meaning would be changed by the proposed additional rule are very many, and the changes will often be significant.  Consider

_example
;I contain no
misspellings.
;

This should not be parsed equivalently to

_example 'I contain nomisspellings.'
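
To make the difference explicit, the following Python sketch contrasts the two readings of that text block. Both function names are hypothetical, <eol> is assumed to be "\n", and read_with_rule_one is only my interpretation of the proposed rule (the first <eol> inside the block is discarded), not an agreed definition.

```python
def read_current(body: str) -> str:
    """Current reading: every <eol> inside the block is part of the value."""
    return body


def read_with_rule_one(body: str) -> str:
    """Proposed Rule one, as interpreted here: drop the first <eol> in the block."""
    first_line, _eol, remainder = body.partition("\n")
    return first_line + remainder


# Text between the opening semicolon and the <eol> before the closing one.
body = "I contain no\nmisspellings."

current = read_current(body)
proposed = read_with_rule_one(body)
```

Under these assumptions, current keeps the line break between "no" and "misspellings.", while proposed runs the two words together.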

Consider also text blocks containing source code in languages that are sensitive to line breaks, such as Fortran or Python.


John
--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital



_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
