Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .. .... .. .. .. .. .. .
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .. .... .. .. .. .. .. .
- From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
- Date: Tue, 29 Jun 2010 10:06:33 -0500
On Monday, June 28, 2010 6:00 PM, SIMON WESTRIP wrote:

>John suggests "the goal of CIF being compatible with general-purpose text tools"
>
>This is possibly the crux of the matter.

It is right at the heart of the matter, I agree, and it comes with an historical impetus. As I composed these comments, I distilled what I think are the essences of the two main positions into two short statements that capture, for me, the alternatives before us. Please forgive the somewhat didactic discussion leading up to these, and skip straight to the *** if you wish to ignore my long-windedness altogether.

>Unless a general-purpose text tool is capable of the determining text encoding system, it ain't going to be much use
>for a CIF that was encoded on a different system and uses non-ASCII chars?

Forgive me if I am reading too much into the question, but I think it highlights a central difference of understanding: some parties to this discussion seem to hold that text vs. binary is an inherent characteristic of a file, but I maintain that a stream of bytes divorced from any explicit or implicit metadata about its encoding is binary, not text. This complication of electronic text handling is not new, but it has assumed much more prominence as internationalization issues have gained importance.

Implicit encoding metadata commonly takes the form of the text in question being encoded according to the default scheme for the system or tool. It could, in one sense, also take the form of a requirement in the format specification, but that is meaningful only for tools specific to the format, which rather moots the text vs. binary question. It could also take the form of local policy, such as "all CIFs in this archive are encoded in CESU-8," which would be useful to tools configured for the relevant environment (e.g. a web server).

Explicit metadata can be carried by the file itself or conveyed out-of-band. XML's encoding attribute is an example of the former, and HTTP's content-type header is an example of the latter. These are useful only to certain tools, specific to a particular format, environment, or exchange mechanism.

One of the upshots of all this is that transcoding must in general be a routine aspect of text file exchange, as that can make explicit encoding metadata implicit. As Simon has shown, transcoding is not automatic in many contexts, so it may require extra work on the receiving end. To the extent that there is a current assumption and practice of CIFs being stored and forwarded byte-for-byte as received (i.e. without transcoding or explicit metadata), CIF is already being treated as a binary format. In a sense, perhaps, it is being treated simultaneously as several distinct binary formats.

***

>By extending the character set beyond ASCII, we have to accept that not all general-purpose text tools are going to
>be applicable as CIF editors/viewers.

That's a valid perspective, but I would sharpen it:

    As part of extending the character set beyond ASCII, we abandon the premise that CIF is a text format, though under some circumstances it may still be possible to manipulate CIFs with tools designed for text.

Alternatively, I have been advocating essentially this:

    By extending the character set beyond ASCII, we magnify the importance of exchanging and storing CIFs according to text conventions, including correctly communicating encodings as necessary and transcoding as appropriate.

I hope the latter position adequately encompasses Herb's view as well. Each position carries additional baggage, which I have omitted to focus on the essential ideas.

If wider comment is sought, then I submit that these alternatives provide a suitable basis for soliciting such. Whichever position prevails, I should like to see something substantially similar to the corresponding position statement above be inserted into the spec.

Regards,

John
--
John C. Bollinger, Ph.D.
Department of Structural Biology
St. Jude Children's Research Hospital

Email Disclaimer: www.stjude.org/emaildisclaimer
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
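
The message's central claim, that a byte stream without encoding metadata is effectively binary and that transcoding is what turns explicit metadata into implicit metadata, can be illustrated with a minimal Python sketch. It is not from the original post: the sample string, the `transcode` helper, and the choice of UTF-8 and Latin-1 are all assumptions made purely for illustration.

```python
# Illustrative sketch only; the sample value and helper name are hypothetical.
# "Å" (U+00C5) stands in for any non-ASCII character a CIF might contain.
text = "_cell_length_a 5.43 Å"

# What is actually stored or transmitted: bytes, carrying no encoding label.
raw = text.encode("utf-8")

# A receiver with the right metadata recovers the text exactly.
as_utf8 = raw.decode("utf-8")

# A receiver that assumes its own default (Latin-1 here) gets no error,
# just different characters: the same bytes are a different "text".
as_latin1 = raw.decode("latin-1")

# Pure ASCII content decodes identically under both assumptions, which is
# why the question could be ignored while CIF stayed within ASCII.
ascii_raw = "_cell_length_a 5.43".encode("ascii")
ascii_same = ascii_raw.decode("utf-8") == ascii_raw.decode("latin-1")


def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from a known source encoding to a target encoding.

    The source encoding is the explicit metadata; after transcoding to the
    receiver's default, that metadata becomes implicit.
    """
    return data.decode(source).encode(target)


# Transcoding on receipt: local tools that assume Latin-1 now read it correctly.
local_copy = transcode(raw, "utf-8", "latin-1")
recovered = local_copy.decode("latin-1")
```

Storing `raw` byte-for-byte and letting each tool guess corresponds to the "binary" treatment described in the message; calling something like `transcode` at the point of exchange corresponds to the "text conventions" alternative.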