Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .. .. .... .. .. .. .. .
- To: Group finalising DDLm and associated dictionaries <[email protected]>
- Subject: Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .. .. .... .. .. .. .. .
- From: James Hester <[email protected]>
- Date: Wed, 30 Jun 2010 10:42:15 +1000
- In-Reply-To: <[email protected]>
Herbert: while I'm mindful of your wish to discuss this with the wider community, I think John B's comments over the last few days have been helpful in moving the discussion along - or at least the one going on in my head - so I'd like to explore this a bit before attempting to engage the wider community. And John B's suggested improvements to my summary should be incorporated in any case.

James.

On Wed, Jun 30, 2010 at 1:16 AM, Herbert J. Bernstein <[email protected]> wrote:
> I would like to see this matter brought to the community as a whole
> to discuss and decide. -- Herbert
>
> =====================================================
>  Herbert J. Bernstein, Professor of Computer Science
>    Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
>
>                  +1-631-244-3035
>                  [email protected]
> =====================================================
>
> On Tue, 29 Jun 2010, Bollinger, John C wrote:
>
>> On Monday, June 28, 2010 6:00 PM, SIMON WESTRIP wrote:
>>
>>> John suggests "the goal of CIF being compatible with general-purpose text tools"
>>>
>>> This is possibly the crux of the matter.
>>
>> It is right at the heart of the matter, I agree, and it comes with an historical impetus. As I composed these comments, I distilled what I think are the essences of the two main positions into two short statements that capture, for me, the alternatives before us. Please forgive the somewhat didactic discussion leading up to these, and skip straight to the *** if you wish to ignore my long-windedness altogether.
>>
>>> Unless a general-purpose text tool is capable of determining the text encoding system, it ain't going to be much use
>>> for a CIF that was encoded on a different system and uses non-ASCII chars?
>>
>> Forgive me if I am reading too much into the question, but I think it highlights a central difference of understanding: some parties to this discussion seem to hold that text vs. binary is an inherent characteristic of a file, but I maintain that a stream of bytes divorced from any explicit or implicit metadata about its encoding is binary, not text. This complication of electronic text handling is not new, but it has assumed much more prominence as internationalization issues have gained importance.
>>
>> Implicit encoding metadata commonly takes the form of the text in question being encoded according to the default scheme for the system or tool. It could, in one sense, also take the form of a requirement in the format specification, but that is meaningful only for tools specific to the format, which rather moots the text vs. binary question. It could also take the form of local policy, such as "all CIFs in this archive are encoded in CESU-8," which would be useful to tools configured for the relevant environment (e.g. a web server).
>>
>> Explicit metadata can be carried by the file itself or conveyed out-of-band. XML's encoding attribute is an example of the former, and HTTP's content-type header is an example of the latter. These are useful only to certain tools, specific to a particular format, environment, or exchange mechanism.
>>
>> One of the upshots of all this is that transcoding must in general be a routine aspect of text file exchange, as that can make explicit encoding metadata implicit. As Simon has shown, transcoding is not automatic in many contexts, so it may require extra work on the receiving end. To the extent that there is a current assumption and practice of CIFs being stored and forwarded byte-for-byte as received (i.e. without transcoding or explicit metadata), CIF is already being treated as a binary format. In a sense, perhaps, it is being treated simultaneously as several distinct binary formats.
>>
>> ***
>>
>>> By extending the character set beyond ASCII, we have to accept that not all general-purpose text tools are going to
>>> be applicable as CIF editors/viewers.
>>
>> That's a valid perspective, but I would sharpen it: as part of extending the character set beyond ASCII, we abandon the premise that CIF is a text format, though under some circumstances it may still be possible to manipulate CIFs with tools designed for text.
>>
>> Alternatively, I have been advocating essentially this: by extending the character set beyond ASCII, we magnify the importance of exchanging and storing CIFs according to text conventions, including correctly communicating encodings as necessary and transcoding as appropriate.
>>
>> I hope the latter position adequately encompasses Herb's view as well. Each position carries additional baggage, which I have omitted to focus on the essential ideas. If wider comment is sought, then I submit that these alternatives provide a suitable basis for soliciting such.
>>
>> Whichever position prevails, I should like to see something substantially similar to the corresponding position statement above be inserted into the spec.
>>
>> Regards,
>>
>> John
>> --
>> John C. Bollinger, Ph.D.
>> Department of Structural Biology
>> St. Jude Children's Research Hospital
>>
>> Email Disclaimer: www.stjude.org/emaildisclaimer

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
[email protected]
http://scripts.iucr.org/mailman/listinfo/ddlm-group
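
The point quoted above, that bytes without encoding metadata are effectively binary and that transcoding needs the source encoding to be known, can be illustrated with a minimal Python sketch. The sample CIF-like line and the helper name `transcode` are assumptions made for illustration and do not come from the thread.

```python
# Illustrative sketch only: the sample line and the helper name are
# hypothetical and are not taken from the discussion.

def transcode(data: bytes, source: str, target: str) -> bytes:
    """Re-encode bytes from a known source encoding into a target encoding."""
    return data.decode(source).encode(target)

# A CIF-like line with one non-ASCII character (U+00C5, A with ring above).
text = "_cell_length_a 5.43 \u00c5"

utf8_bytes = text.encode("utf-8")
latin1_bytes = text.encode("latin-1")

# The same text yields different byte streams under different encodings.
assert utf8_bytes != latin1_bytes

# A receiver that assumes the wrong encoding silently gets different text.
misread = utf8_bytes.decode("latin-1")
assert misread != text

# Bytes that are valid in one encoding may not be decodable in another.
try:
    latin1_bytes.decode("utf-8")
    decodable = True
except UnicodeDecodeError:
    decodable = False

# Once the source encoding is known, transcoding recovers the intended bytes.
assert transcode(latin1_bytes, "latin-1", "utf-8") == utf8_bytes
```

The sketch only shows why the encoding has to travel with the bytes, implicitly or explicitly; it does not favour either of the two position statements.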