
Re: [Cif2-encoding] Drafting issues


On Friday, October 01, 2010 9:10 AM, I wrote:

>I think with that we have reached an acceptable position.  I do
>propose three editorial changes, however, that I intend to clarify
>the wording without changing its meaning in any way:

Here is specific proposed wording that realizes my suggestions, keeping everything in the same section rather than moving anything to an annex:

====
CIF2 files are standard variable length plain text files, which for compatibility with older processing systems will have a maximum line length of 2048 characters. As discussed above and below, however, there are some restrictions on the character set for token delimiters, separators and data names.

For compatibility with CIF1 behaviour, there is no formal restriction on the encoding of CIF2 files, provided they contain only code points from the ASCII range.  If a CIF2 file contains characters equivalent to Unicode code points greater than U+007F (127 decimal), then the particular encoding used must either be UTF8 or algorithmically identifiable from the CIF2 file itself.  Acceptable identification algorithms will be published as necessary as annexes to this standard (see description of magic code and encoding disambiguation in Change 1).  Annexes notwithstanding,
(i) a CIF2 file containing characters outside the ASCII range with no BOM and no disambiguation signature will be a UTF8 file, and
(ii) a CIF2 file containing characters outside the ASCII range with a valid UTF8 or UTF16 BOM and no disambiguation signature will be a Unicode file written in the indicated encoding.

The use of a BOM for Unicode encodings, including UTF8, is recommended.
====
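
For illustration only, the two default rules (i) and (ii) above could be sketched in Python roughly as follows. The function name and the returned labels are hypothetical, and the sketch assumes that no disambiguation signature is present, since the annexes that would define one are not part of this wording. It also does not verify that the bytes are valid in the encoding it reports.

```python
import codecs


def guess_cif2_encoding(data: bytes) -> str:
    """Hypothetical helper: apply default rules (i) and (ii) to raw file bytes.

    Assumes no disambiguation signature is present; signature handling
    would come from annexes that this sketch does not model.
    """
    # Rule (ii): a valid UTF8 or UTF16 BOM indicates the encoding.
    if data.startswith(codecs.BOM_UTF8):
        return "utf-8"
    if data.startswith(codecs.BOM_UTF16_LE):
        return "utf-16-le"
    if data.startswith(codecs.BOM_UTF16_BE):
        return "utf-16-be"

    # No BOM and only bytes in the ASCII range (<= 0x7F): the wording
    # places no formal restriction on the encoding in this case.
    if all(byte <= 0x7F for byte in data):
        return "ascii"

    # Rule (i): non-ASCII content, no BOM, no signature -> UTF8.
    return "utf-8"
```

A caller would read the file in binary mode and pass the leading bytes (or the whole content) to the helper, then decode with the returned label.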

Regards,

John
--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital


Email Disclaimer:  www.stjude.org/emaildisclaimer

_______________________________________________
cif2-encoding mailing list
cif2-encoding@iucr.org
http://scripts.iucr.org/mailman/listinfo/cif2-encoding
