Discussion List Archives


RE: Draft CIF2 standard available

Hi All,

A few more comments about CIF2 (I hope you're not tiring of them!):

(*) Change 1 / Change 2: I think it would be wise to specify that if a Unicode byte order mark (U+FEFF) appears (UTF-8 encoded) at the beginning of a CIF2 file then it is not considered part of the CIF content.  Some text editors insert one automatically (even though it is not required when the text is UTF-8 encoded), and that practice is permitted by Unicode even though it is not recommended.  This particularly impacts parsers that attempt to defer UTF-8 decoding until after lexical analysis, or that make it an application responsibility.  It may also affect recognition of a CIF2 file by its initial magic comment.
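
As a rough illustration of the BOM point, a parser that works on raw bytes could discard a leading UTF-8 encoded U+FEFF before looking for the magic comment. This is only a sketch: the function names are hypothetical, and the magic string `#\#CIF_2.0` is an assumption here, not something quoted from the draft.

```python
import codecs

# Assumed CIF2 magic comment; check the draft for the exact text.
CIF2_MAGIC = b"#\\#CIF_2.0"


def strip_utf8_bom(data: bytes) -> bytes:
    """Return data without a leading UTF-8 encoded U+FEFF (EF BB BF)."""
    if data.startswith(codecs.BOM_UTF8):
        return data[len(codecs.BOM_UTF8):]
    return data


def looks_like_cif2(data: bytes) -> bool:
    """Test for the magic comment, ignoring an optional leading BOM."""
    return strip_utf8_bom(data).startswith(CIF2_MAGIC)
```

Without the stripping step, a file saved by a BOM-inserting editor would fail the `startswith` test even though its content is otherwise valid.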

(*) Paragraph 42 of the CIF 1.1 syntax spec permits CIF processors to normalize line break sequences, including within data values, in the same way that XML 1.0 processors are required to do.  XML 1.1 extends the list of line termination sequences that an XML 1.1 processor must normalize (http://www.w3.org/TR/2006/REC-xml11-20060816/#sec-line-ends).  The draft CIF2 spec expressly forbids the additional normalizations of XML 1.1 from being used in CIF2 in any syntactically significant way.  I find that CIF2 limitation unfortunate, but I am not much interested in debating the point.  However, may CIF2 processors at least be permitted to perform the expanded set of normalizations on data values?
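
For reference, the two sets of normalizations can be sketched as below. The function name and the `xml11` flag are hypothetical; the sequences follow the line-end rules of XML 1.0 (CR LF and lone CR) and XML 1.1 (additionally CR NEL, NEL, and U+2028), each mapped to a single LF.

```python
import re

# XML 1.0: CR LF, or a CR not followed by LF, becomes LF.
_XML10_LINE_ENDS = re.compile("\r\n|\r")

# XML 1.1 adds CR NEL, lone NEL (U+0085), and LINE SEPARATOR (U+2028).
# Two-character sequences are listed first so they match as one unit.
_XML11_LINE_ENDS = re.compile("\r\n|\r\u0085|\r|\u0085|\u2028")


def normalize_line_ends(text: str, xml11: bool = False) -> str:
    """Replace each recognized line-end sequence with a single LF."""
    pattern = _XML11_LINE_ENDS if xml11 else _XML10_LINE_ENDS
    return pattern.sub("\n", text)
```

The question above amounts to whether a CIF2 processor may call something like `normalize_line_ends(value, xml11=True)` on data values, even though the extra sequences have no syntactic significance.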

(*) In a previous comment, I claimed that the greatest currently assigned Unicode code point was less than U+10FFF.  That claim was incorrect; hence I now assert that U+10FFF as the upper limit of accepted CIF2 characters is either a typo or a mistake.
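
A quick check makes the discrepancy concrete. The Unicode code space ends at U+10FFFF, and assigned characters such as U+20000 (the first CJK Extension B ideograph) lie above U+10FFF. The helper name below is hypothetical.

```python
import sys

UNICODE_MAX = 0x10FFFF  # top of the Unicode code space
DRAFT_LIMIT = 0x10FFF   # upper limit as written in the draft

# Python 3.3+ strings cover the full code space.
assert sys.maxunicode == UNICODE_MAX


def above_limit(text: str, limit: int) -> list:
    """List, as U+XXXX labels, the characters whose code point exceeds limit."""
    return [f"U+{ord(ch):04X}" for ch in text if ord(ch) > limit]
```

For example, `above_limit("abc\U00020000", DRAFT_LIMIT)` reports the ideograph, whereas the same call with `UNICODE_MAX` reports nothing.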

And I think that's it.  I'm not planning to perform any further analysis of the current CIF2 draft, unless in conjunction with discussion of one of the points I have raised.

Best Regards,

John

--
John C. Bollinger, Ph.D.
Computing and X-Ray Scientist
Department of Structural Biology
St. Jude Children's Research Hospital
John.Bollinger@StJude.org
www.stjude.org




Email Disclaimer:  www.stjude.org/emaildisclaimer

_______________________________________________
cif-developers mailing list
cif-developers@iucr.org
http://scripts.iucr.org/mailman/listinfo/cif-developers
