Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .. .... .. .
- To: Group finalising DDLm and associated dictionaries <[email protected]>
- Subject: Re: [ddlm-group] options/text vs binary/end-of-line. .. .. .. .... .. .
- From: David Brown <[email protected]>
- Date: Thu, 24 Jun 2010 10:29:58 -0400
- In-Reply-To: <[email protected]>
I would like to endorse Simon's view. If it ain't broke, don't fix
it. We have managed well with ASCII, and ASCII will continue to be
used for all but the text fields for a long time even if we allow
Unicode. Firstly we have to get dictionaries written, then we have to
have CIF2 compliant user programs (to take advantage of the real virtue
of CIF2, namely methods). Then we have to persuade CIF writers to
produce CIF2 files. They will not be keen to do this until their own
programs are able to read CIF2. How many years down the road have we
now gone, 10? 20? Even when programs are available for writing CIF2,
all the business end of the CIF (numerical tables) will continue to be
written in ASCII and most of the Comment and Abstract fields will be as
well, though there may be some who will find Unicode useful for
subscripts and superscripts and names with accents.

I agree with Simon that we should be looking ahead, but we should be looking ahead to the time when encoding is not the mess that it is at present, and when the choice will be obvious. Surely extending the character set is something that can be added later. DDLm will continue to be written in ASCII, as will most private dictionaries. I think we are planning a large airport before we have a plane that can fly.

My vote is to stay with ASCII but keep extended codings in mind so that we can move when we know which way everyone else is going to move. If we choose UTF-8 now we might be backing the wrong horse and then we will be in real trouble.

David

James Hester wrote:

Before I engage with this latest proposal, I need to pick over the statements in your first paragraph carefully, so bear with me:

On Thu, Jun 24, 2010 at 12:47 AM, Herbert J. Bernstein <[email protected]> wrote:

Here is an issue to consider: If we impose a non-text canonical UTF-8 encoding that does not contain an internal encoding signature, and that file is transmitted as text and not binary from a machine for which, say, ASCII with code pages for, say, western europe, is the native encoding, and the transmission converts the UTF-8 characters as if they were accented characters in Latin-1, then what is received may appear plausible at the receiving end, just wrong.

1. 'That file is transmitted as text': what does this mean? How do I transmit a file as text as opposed to just sending the file contents with no change? What protocol am I using? Email attachment? Http upload? Http downloading a .tgz file from a website? Ftp with 'text' mode?
2. 'The transmission converts the UTF-8 characters': why would it do this? What is this advanced text transmission protocol that is so confident about altering file contents?
3. 'Native encoding': what does this mean? What would the native encoding of my computer be, with one shell window having 'LANG=ru_RU' and another 'LANG=POSIX'? Does the concept of native encoding make any sense at all at an OS level? I'm aware of filenames in filesystems being expressed in standard encodings, but not the file contents themselves.

Just so you know what my mental model of this whole file transmission issue is:

1. Files in the modern computing world are virtually always transmitted without alteration of any bytes at all. Call this binary transmission if you like. I am aware that email protocols may encode to base64 etc., but this is of course to make sure every single byte is identical when it is unpacked at the end.
2. How a file is *displayed* will be application (not OS) dependent. The application may take into account environment variables, any metadata about the file, and user selections. How a file is *displayed* does not change how it is stored on disk.
3. Utilities exist to interconvert between encodings. Modern text editors do not need these tools as they come with a reasonable range of character mapping tables to enable them to *display* the correct character if they are told the correct encoding. There is no such thing as the *correct* encoding for such an application, only a default encoding.
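The failure mode under discussion, UTF-8 bytes read as if they were Latin-1, can be sketched in a few lines of Python. This is only an illustration: the sample strings are assumptions chosen for the example and do not come from any CIF file or from the thread. It shows that the bytes are untouched by a byte-for-byte transfer, that the wrong result comes from the decoding step, and that ASCII-only content is unaffected either way.

```python
# Sample strings are hypothetical, chosen only to illustrate the point.
text = "café"                       # one non-ASCII character
numeric = "1.234(5)"                # ASCII-only, like a CIF numeric field

# What is stored on disk (and sent unchanged by a binary transfer).
utf8_bytes = text.encode("utf-8")

# A reader that wrongly assumes Latin-1 gets plausible but wrong text.
misread = utf8_bytes.decode("latin-1")

# The bytes themselves were never altered, so the mistake is reversible:
# re-encode with the wrong assumption, then decode with the right one.
recovered = misread.encode("latin-1").decode("utf-8")

# ASCII-only content decodes identically under both assumptions.
numeric_misread = numeric.encode("utf-8").decode("latin-1")

print(utf8_bytes)       # the two-byte UTF-8 sequence for the accented letter
print(misread)          # two Latin-1 characters where one was intended
print(recovered == text)
print(numeric_misread == numeric)
```

Every byte sequence is a valid Latin-1 string, which is why the misreading raises no error and merely looks wrong; the reverse mistake (decoding Latin-1 bytes as UTF-8) would usually fail loudly instead.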
begin:vcard
fn:I.David Brown
n:Brown;I.David
org:McMaster University;Brockhouse Institute for Materials Research
adr:;;King St. W;Hamilton;Ontario;L8S 4M1;Canada
email;internet:[email protected]
title:Professor Emeritus
tel;work:+905 525 9140 x 24710
tel;fax:+905 521 2773
version:2.1
end:vcard
_______________________________________________ ddlm-group mailing list [email protected] http://scripts.iucr.org/mailman/listinfo/ddlm-group

