Discussion List Archives

[Date Prev][Date Next][Thread Prev][Thread Next][Date Index][Thread Index]

Re: [Cif2-encoding] Drafting issues

On Friday, October 01, 2010 9:10 AM, I wrote:

>I think with that we have reached an acceptable position.  I do
>propose three editorial changes, however, that I intend to clarify
>the wording without changing its meaning in any way:

Here is specific proposed wording that realizes my suggestions, keeping everything in the same section rather than moving anything to an annex:

CIF2 files are standard variable length plain text files, which for compatibility with older processing systems will have a maximum line length of 2048 characters. As discussed above and below, however, there are some restrictions on the character set for token delimiters, separators and data names.

For compatibility with CIF1 behaviour, there is no formal restriction on the encoding of CIF2 files, providing they contain only code points from the ASCII range.  If a CIF2 file contains characters equivalent to Unicode code points greater than U+0077 (127 decimal), then the particular encoding used must either be UTF8 or algorithmically identifiable from the CIF2 file itself.  Acceptable identification algorithms will be published as necessary as annexes to this standard (see description of magic code and encoding disambiguation in Change 1).  Annexes notwithstanding,
(i) a CIF2 file containing characters outside the ASCII range with no BOM and no disambiguation signature will be a UTF8 file, and
(ii) a CIF2 file containing characters outside the ASCII range with a valid UTF8 or UTF16 BOM and no disambiguation signature, will be a Unicode file written in the indicated encoding.

The use of a BOM for Unicode encodings, including UTF8, is recommended.


John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital

Email Disclaimer:  www.stjude.org/emaildisclaimer

cif2-encoding mailing list

Reply to: [list | sender only]
International Union of Crystallography

Scientific Union Member of the International Council for Science (admitted 1947). Member of CODATA, the ICSU Committee on Data. Member of ICSTI, the International Council for Scientific and Technical Information. Partner with UNESCO, the United Nations Educational, Scientific and Cultural Organization in the International Year of Crystallography 2014.

ICSU Scientific Freedom Policy

The IUCr observes the basic policy of non-discrimination and affirms the right and freedom of scientists to associate in international scientific activity without regard to such factors as ethnic origin, religion, citizenship, language, political stance, gender, sex or age, in accordance with the Statutes of the International Council for Science.