
Re: [ddlm-group] Vote on BOM

It seems that not everyone agrees that the CIF2 encoding is UTF-8. Allowing multiple encodings would influence my vote on the UTF-8 BOM.

Simon



From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
To: ddlm-group <ddlm-group@iucr.org>
Sent: Friday, 18 June, 2010 16:33:01
Subject: Re: [ddlm-group] Vote on BOM


As far as I am aware, I do not have voting rights here, since I am not formally a member of the DDLm working group.  If I did, these would be my votes (and feel free to count them anyway ;-) ):

>1. Treatment of UTF8 BOM as first three bytes of a CIF2 file
>    (a) Syntax error/Non CIF2 file
>    (b) UTF8-BOM followed by #\#CIF2.0 is a valid CIF2 magic number

I favor Herb's position that CIF2 should be defined as a Unicode text format, in which context encoding would be out of scope.  Thus an initial BOM should be allowed and handled by the decoder (or simply allowed by a parser that attempts to defer decoding).  This assumes that the processor supports UTF-8, which I would be satisfied to make a non-exclusive requirement on CIF2 processors.

(b), more or less.
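
As a rough illustration of option (b), here is a minimal Python sketch of a processor that tolerates an optional UTF-8 BOM ahead of the magic number and leaves BOM removal to the decoder. The magic string is copied as written in the ballot above, and the function names are hypothetical; nothing here is taken from an existing CIF implementation.

```python
import codecs

# Magic string exactly as written in the ballot; an assumption for this sketch.
CIF2_MAGIC = b"#\\#CIF2.0"


def has_cif2_magic(data: bytes) -> bool:
    """True if data starts with the magic code, optionally preceded by a UTF-8 BOM."""
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    return data.startswith(CIF2_MAGIC)


def decode_cif2(data: bytes) -> str:
    """Decode as UTF-8, letting the 'utf-8-sig' codec drop a leading BOM."""
    if not has_cif2_magic(data):
        raise ValueError("not a CIF2 file")
    return data.decode("utf-8-sig")
```

With this approach the parser proper never sees the initial BOM, which matches the view that encoding is the decoder's concern.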

>2. Treatment of UTF8 BOM in a CIF file, other than as the first three bytes:
>    (a) Always a syntax error
>    (b) Syntactic whitespace
>    (c) An ordinary character:
>          (i) May appear only in delimited data values and comments
>          (ii) May appear anywhere other ordinary characters can appear (i.e. including datanames, datablock names etc.)
>    (d) Silently ignored

(c)(i)
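
To make the "ordinary character" reading concrete: once the text is decoded, an embedded BOM is simply U+FEFF at some offset, and a parser can decide per context whether to accept it. The helper below is a hypothetical Python sketch that only locates such characters; the check for "inside a delimited value or comment" would belong to the parser and is not shown.

```python
from typing import List

BOM = "\ufeff"


def embedded_bom_offsets(text: str) -> List[int]:
    """Offsets of U+FEFF anywhere after the first character of decoded text."""
    return [i for i, ch in enumerate(text) if ch == BOM and i > 0]
```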

>3. Treatment of UCS BOM in a CIF file
>  (a) Syntax error
>  (b) Encoding switch

Inasmuch as I favor defining CIF as a text format, these alternatives do not make sense, as they relate to encoding details.  I am against CIF requiring processors to support encoding schemes that provide for embedded encoding switches, but I am perfectly satisfied for CIF to *allow* processors to support such schemes.  That amounts to

(c) Encoding scheme dependent
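
A processor that chooses to support more than UTF-8 could pick a decoder from the leading byte signature, which is one way to read "encoding scheme dependent". The following Python sketch is an assumption about how that might look, not a proposal for the specification; the function name and the fallback to UTF-8 are illustrative.

```python
import codecs

# Longer signatures first: the UTF-32 LE BOM begins with the UTF-16 LE BOM.
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]


def sniff_encoding(data: bytes, default: str = "utf-8") -> str:
    """Return a Python codec name chosen from a leading BOM, else the default."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return default
```

The returned codec names all consume the BOM on decoding, so `data.decode(sniff_encoding(data))` yields text without it. UTF-16 LE data whose first character after the BOM is U+0000 would be misread as UTF-32, which should not arise for a file that begins with `#`.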


John
--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital


Email Disclaimer:  www.stjude.org/emaildisclaimer

_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
