Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .
- To: Group finalising DDLm and associated dictionaries <[email protected]>
- Subject: Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .
- From: James Hester <[email protected]>
- Date: Wed, 23 Jun 2010 11:04:45 +1000
- In-Reply-To: <8F77913624F7524AACD2A92EAF3BFA54166122951E@SJMEMXMBS11.stjude.sjcrh.local>
Thanks John for putting in the effort to come up with a decent compromise proposal. I would add something along the lines of 'Compliant CIF2 processors should at a minimum be able to deal with files in CIF interchange format'. And somewhere I would really like to warn people of the dangers of using anything else for storage. But I think I could live with what you've come up with, as it looks like I'm unlikely to get support for anything more restrictive.

On Wed, Jun 23, 2010 at 1:15 AM, Bollinger, John C <[email protected]> wrote:
>
> I prefer leaving the issue of character encoding entirely out of the scope of the CIF format specification (effectively allowing any encoding). On the other hand, I think it's a bit of an aggrandizement to characterize UTF-16 / Shift-JIS / etc. as "ways in which many of our colleagues get their science done." In no way do I dispute that many of our colleagues indeed use these encodings routinely, but I am doubtful that editing Unicode text with a text editor constitutes a significant part of many of their research programs. At least, few of my English-speaking colleagues edit flat Unicode text files with any frequency, if ever they do at all.
>
> I think there is already good software, some of it free (both senses), for operating systems at least as old as Windows 9x, that supports editing UTF-8 encoded text. Most of it also supports a multitude of other encodings. We would leave no one out by requiring UTF-8, and I do not see that respect for our colleagues demands that CIF2 be equally convenient to create and edit with every text editor in current use. If that is doubtful, however, and respect is our goal, then wouldn't the most respectful thing be to *ask* a few of the people about whom we are concerned?
>
> My issue here is different, and at least partly philosophical. The CIF format can and should be about the structure and meaning of CIF text content. Character encoding is on a different level: it's a characteristic of storage and interchange. Comingling these layers is inelegant and unnecessary.
>
> Moreover, a CIF2 requirement to encode in UTF-8 will be small comfort when presented with a file that is not, in fact, encoded that way. What can you then do? Either reject the file or autodetect the encoding. If CIF2 does not specify a particular encoding, and you receive the same file, then what can you do? Exactly the same things, but then it's more likely that the file's provider will have also specified the encoding by some means. (Particularly so if the CIF2 spec calls attention to the need to do so.)
>
> Perhaps something like this would be an acceptable compromise:
> a) Rewrite change 2 to remove the requirement for UTF-8
> b) Add:
> ====
> CHANGE 9 - NEW (CIF Interchange Format)
>
> Many alternative encodings are available for recording and exchanging Unicode character data via byte-oriented media. The CIF format itself is encoding independent, but that allows for uncertainty as to how to handle putative CIF data unaccompanied by encoding information. We therefore define a simple, binary CIF Interchange Format, consisting of CIF2 text encoded in UTF-8, with an optional initial UTF-8 byte-order mark. CIF Interchange Format is intended as a storage and interchange standard for CIF2. Its use is strongly encouraged, but its existence should not be taken as a prohibition against use of alternative storage and interchange formats among agreeing parties.
>
> The standard file name extension for CIF Interchange Format files is .cif.
> ====
>
>
> Regards,
>
> John
> --
> John C. Bollinger, Ph.D.
> Department of Structural Biology
> St. Jude Children's Research Hospital
>
>
> Email Disclaimer: www.stjude.org/emaildisclaimer
>
> _______________________________________________
> ddlm-group mailing list
> [email protected]
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
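
As an illustration of the CIF Interchange Format proposed in CHANGE 9 above (CIF2 text encoded in UTF-8, with an optional initial UTF-8 byte-order mark), here is a minimal Python sketch of how a reader might accept such a file and reject anything else. The function name decode_cif_interchange is hypothetical and not part of any CIF library. The sketch takes the "reject the file" route; autodetecting other encodings is deliberately not attempted.

```python
import codecs


def decode_cif_interchange(data: bytes) -> str:
    """Decode bytes as proposed CIF Interchange Format.

    Hypothetical helper: accepts UTF-8 with an optional leading
    UTF-8 byte-order mark and rejects any other byte sequence.
    """
    # Drop the optional initial BOM (bytes EF BB BF) if present.
    if data.startswith(codecs.BOM_UTF8):
        data = data[len(codecs.BOM_UTF8):]
    try:
        # Strict decoding: invalid UTF-8 raises instead of being replaced.
        return data.decode("utf-8")
    except UnicodeDecodeError as exc:
        raise ValueError("not CIF Interchange Format: invalid UTF-8") from exc


# Illustrative input only; the content is not meant to be a complete CIF2 file.
text = decode_cif_interchange(codecs.BOM_UTF8 + b"data_test\n")
```

A file encoded in UTF-16 would raise ValueError here, because its leading bytes are not valid UTF-8. Under the compromise wording, agreeing parties could still exchange such a file, but they would need to convey the encoding by some other means.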