Re: [Cif2-encoding] How we wrap this up
- To: Group for discussing encoding and content validation schemes for CIF2 <cif2-encoding@xxxxxxxx>
- Subject: Re: [Cif2-encoding] How we wrap this up
- From: "Herbert J. Bernstein" <yaya@xxxxxxxxxxxxxxxxxxxxxxx>
- Date: Tue, 28 Sep 2010 15:40:46 -0400 (EDT)
- In-Reply-To: <8F77913624F7524AACD2A92EAF3BFA5416659DEDE7@SJMEMXMBS11.stjude.sjcrh.local>
- References: <AANLkTi=hmKNFMgaeMqt69=sG6dOmxZRUrffB1khjF+mZ@mail.gmail.com><alpine.BSF.2.00.1009240742480.8859@epsilon.pair.com><613218.81205.qm@web87011.mail.ird.yahoo.com><281388.90819.qm@web87012.mail.ird.yahoo.com><463665.7127.qm@web87004.mail.ird.yahoo.com><alpine.BSF.2.00.1009251413550.93269@epsilon.pair.com><262880.46378.qm@web87002.mail.ird.yahoo.com><alpine.BSF.2.00.1009251537250.57408@epsilon.pair.com><a06240800c8c5653f38cf@192.168.2.104><476110.27334.qm@web87005.mail.ird.yahoo.com><a06240805c8c6224b8789@192.168.2.104><a06240803c8c65f78a402@149.72.2.199><8F77913624F7524AACD2A92EAF3BFA5416659DEDE3@SJMEMXMBS11.stjude.sjcrh.local><alpine.BSF.2.00.1009271801070.86201@epsilon.pair.com><alpine.BSF.2.00.1009271900080.86201@epsilon.pair.com><AANLkTikudiXBk7orHSAH=JonoeQHeNXVrzvAZmH3Wt94@mail.gmail.com><646265.82162.qm@web87004.mail.ird.yahoo.com><8F77913624F7524AACD2A92EAF3BFA5416659DEDE7@SJMEMXMBS11.stjude.sjcrh.local>
Dear John,

The norm in standards work is to deprecate features for a while (at least months and preferably years) before you remove them.

> Recommending UTF-8 and / or UTF-16 without mandating support for one or
> both does not get us where I insist we need to be.

The problem is coming to agreement on "support" and that pesky word "mandating". Up until now, in order for a CIF application developer or user to produce compliant CIFs, all they had to do was to produce a text file in whatever encoding was provided on their system. Now you wish to mandate that they be able to produce UTF8 or UTF16, even if they are running on some code-page based system. It is fine to recommend to them that they do this. It is fine to tell them that CIF compliance via some non-Unicode encoding is about to be deprecated, so they should take the issue seriously, but it is most definitely _not_ fine to mandate that they make the change _now_ because we are impatient, and don't want to go through the normal standards process of deprecating a feature before removing it.

We have already made that mistake with other CIF2 features, e.g. the drastic change in string quoting. At least with the string quoting change we can easily provide portable conversion software and compensate for the discourtesy of failing to provide a transition period during which the old quoting convention is deprecated (provided we add the concatenation operator to the CIF2 spec). In addition, we have the excuse of there being a reasonable need to harmonize the string quotation process between the CIF itself and DDLm to have dREL function smoothly.

What is our excuse for impatience on the encoding issue? We don't have any reasonable prospect of clean software support for the encoding transition. We don't even have a coherent document yet.

John, you seem to be doing a lot of "insisting" and "requiring". Precisely which existing CIF workflows are going to be negatively impacted if your demands are not met?
What problem is being solved?

The motion I have proposed does not make anything worse for anybody currently using CIF and allows them to start moving into CIF2 now. Your approach imposes conditions it will take months or years to meet with no prospect that satisfying your demands will solve any problem for anybody.

Please rethink your position. If we recommend UTF8/UTF16 support we have a decent chance that somebody will simply provide it. If we mandate UTF8/UTF16 support we force pointless delays in the adoption of the rest of CIF2 and gain what in exchange?

Regards,
Herbert

P.S. A partial answer to your question about text encodings is at

http://en.wikipedia.org/wiki/Character_encoding

However, the real answer (not a joke) is that a text encoding is whatever the formatted I/O system in a Fortran compiler on the system under discussion reads and writes, or the format of a COBOL EBCDIC-sequential file or a COBOL ASCII line-sequential file, or what a text editor on the system handles. That is the point -- text is something very, very system and language dependent. The strange thing is that text files have a much longer practical survival time than binary files, as backwards as that may seem, because there is a much larger investment in ensuring the continued readability of text files than of binary files.

=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
yaya@dowling.edu
=====================================================

On Tue, 28 Sep 2010, Bollinger, John C wrote:

>
> On Tuesday, September 28, 2010 4:54 AM, SIMON WESTRIP wrote:
> [...]
>> So I think the 'As for CIF1...' proposals with this explicit default
>> encoding is certainly heading towards a workable compromise. Herbert is
>> unhappy to mandate a particular encoding for non-ASCII use, but has
>> agreed to recommend UTF8 and UTF16 in such cases. Such recommendations
>> along with a default encoding that should be adopted in the absence of
>> any pointers to the contrary could boil down to UTF8/16 + local in all
>> intents and purposes, and could boil down to UTF8/16 if you want to use
>> non-ASCII text.
>
> Recommending UTF-8 and / or UTF-16 without mandating support for one or
> both does not get us where I insist we need to be. In particular, the
> point of requiring support for at least one specific encoding applicable
> to the entire CIF2 character repertoire is to provide a means *wholly
> within the standard* by which conforming parties can be certain of
> communicating arbitrary CIF content accurately. The various Unicode
> Transformation Formats have additional desirable properties in that
> regard that we have covered extensively.
>
> If establishing UTF-8 as the default encoding confers a mandate to
> support it then where indeed is the great distinction between 'As for
> CIF1...' and UTF-8 (+- UTF-16) + local? If there is one then it can
> only be in the definition of "text" on the one hand and "local" on the
> other, which is to say in the details of support for non-UTF-x
> encodings. That is an area where perhaps we could find a consensus, or
> at least a strong majority opinion. For that to happen, I require
> definitions of "text" and "text file" sufficient to program to. James
> has asked for the same. "local" already provides such definitions,
> intended to cover the cases that CIF1 allows and UTF-8 +- UTF-16 does
> not. Are there cases it misses that should be covered? Are there cases
> it covers that should be missed?
>
>
> Regards,
>
> John
> --
> John C. Bollinger, Ph.D.
> Department of Structural Biology
> St. Jude Children's Research Hospital
>
>
> Email Disclaimer: www.stjude.org/emaildisclaimer
> _______________________________________________
> cif2-encoding mailing list
> cif2-encoding@iucr.org
> http://scripts.iucr.org/mailman/listinfo/cif2-encoding
- Follow-Ups:
- Re: [Cif2-encoding] How we wrap this up (Bollinger, John C)
- References:
- [Cif2-encoding] How we wrap this up (James Hester)
- Re: [Cif2-encoding] How we wrap this up (Herbert J. Bernstein)
- Re: [Cif2-encoding] How we wrap this up (SIMON WESTRIP)
- Re: [Cif2-encoding] How we wrap this up (SIMON WESTRIP)
- Re: [Cif2-encoding] How we wrap this up (SIMON WESTRIP)
- Re: [Cif2-encoding] How we wrap this up (Herbert J. Bernstein)
- Re: [Cif2-encoding] How we wrap this up (SIMON WESTRIP)
- Re: [Cif2-encoding] How we wrap this up (Herbert J. Bernstein)
- Re: [Cif2-encoding] How we wrap this up (SIMON WESTRIP)
- Re: [Cif2-encoding] How we wrap this up (Bollinger, John C)
- Re: [Cif2-encoding] How we wrap this up (Herbert J. Bernstein)
- Re: [Cif2-encoding] How we wrap this up (Herbert J. Bernstein)
- Re: [Cif2-encoding] How we wrap this up (James Hester)
- Re: [Cif2-encoding] How we wrap this up (SIMON WESTRIP)
- Re: [Cif2-encoding] How we wrap this up (Bollinger, John C)