Re: [Cif2-encoding] How we wrap this up
- To: Group for discussing encoding and content validation schemes for CIF2 <cif2-encoding@xxxxxxxx>
- Subject: Re: [Cif2-encoding] How we wrap this up
- From: "Herbert J. Bernstein" <yaya@xxxxxxxxxxxxxxxxxxxxxxx>
- Date: Mon, 27 Sep 2010 12:23:32 -0400
- In-Reply-To: <223950.86835.qm@web87002.mail.ird.yahoo.com>
Dear Simon,

We do not seem to be communicating effectively. Do you have a Skype account? We really need a meeting.

Regards,
Herbert

At 3:27 PM +0000 9/27/10, SIMON WESTRIP wrote:
>I see nothing wrong with a strategy to introduce CIF2 if necessary.
>My initial thoughts are that the current 'as for CIF1...' description
>is not best suited as a base specification on which to build full
>unicode support, should such a strategy be pursued.
>
>However, I will reflect on this along with recent contributions from
>James and John...
>
>Cheers
>
>Simon
>
>From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>To: Group for discussing encoding and content validation schemes for
>CIF2 <cif2-encoding@iucr.org>
>Sent: Monday, 27 September, 2010 14:45:16
>Subject: Re: [Cif2-encoding] How we wrap this up
>
>The problem is that options 3, 4 and 5 specifically prescribe the
>use of Unicode characters (that is the entire point of those
>options -- and that is the point in dispute -- whether we should
>be prescribing UTF8 or using it as we now use ASCII, as a way to
>be clear what we are talking about as in CIF1) and we simply are not
>ready to deal with such a requirement yet.
>
>I take the blame for starting this discussion many years ago when
>I simply asked for just what my motion says, that we start using
>UTF8 in the same way we had been using ASCII. Unfortunately
>this discussion has turned into a strong push to focus CIF on
>that particular encoding, stop using Brian's elides, etc. With
>the current weak state of software support for CIF and the large
>investment at the IUCr and at the PDB in current workflows, I
>think it would be a very disruptive and expensive change to make
>right now. God and the Devil are in the details.
>
>Note that I am _not_ basing this argument on imgCIF. At this point
>it appears, unfortunately, that CIF2 and imgCIF will have to diverge.
>If we have enough face-to-face discussions, perhaps we can bring
>them together again, as we did in 1998, but that is an even more
>difficult discussion than the one we need to have on encodings.
>What I hope we will do is to go at this in incremental stages:
>
>1. Make the transition from CIF1 to CIF2 using new dictionaries
>but allowing most data files to remain unchanged, and providing
>simple algorithmic transformations for the rest, but keeping
>most of the current semantic extensions that we have in CIF1,
>focusing our energy on getting the new dictionaries used and
>making use of dREL;
>
>2. Work on a CIF2.1 that, by creative and well-supported use
>of Unicode, allows for a well organized transition from Brian's
>elides to use of Unicode characters;
>
>3. Then working in that context, whatever it turns out to be,
>work on having imgCIF make the transition to CIF2 in some
>reasonably compatible way.
>
>I see how to do item 1 for next summer. I don't see how to do 2 and
>3 in that time frame, though I am sure we could make a dent in
>them if we could meet face to face. Email tends to stiffen too
>many positions.
>
>Regards,
> Herbert
>
>=====================================================
>Herbert J. Bernstein, Professor of Computer Science
> Dowling College, Kramer Science Center, KSC 121
> Idle Hour Blvd, Oakdale, NY, 11769
>
> +1-631-244-3035
> yaya@dowling.edu
>=====================================================
>
>On Mon, 27 Sep 2010, SIMON WESTRIP wrote:
>
>> Dear Herbert
>>
>> I do not understand why it is *only* options 3, 4 or 5 that allow
>> users to start using unicode characters?
>>
>> More generally, are you suggesting that the use of anything but ASCII
>> in a data value is only allowed if e.g. the dictionary definition of
>> the data item permits, or even only if the IUCr says that's OK?
>>
>> Fundamentally, I'm starting to infer that the purpose of the 'as for
>> CIF1...' approach to encoding is to open the door to full unicode
>> support, but not actually let anyone cross the threshold?
>>
>> Cheers
>>
>> Simon
>>
>> ________________________________________
>> From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>> To: Group for discussing encoding and content validation schemes for CIF2
>> <cif2-encoding@iucr.org>
>> Sent: Monday, 27 September, 2010 11:48:49
>> Subject: Re: [Cif2-encoding] How we wrap this up
>>
>> Dear Simon,
>>
>> Under the CIF2 specification with UTF8 in place of ASCII there is
>> _no_ change in the use of elided ASCII sequences to represent non-ASCII
>> characters until and unless the IUCr publications office decides that,
>> for that particular application, they are ready to accept something
>> new.
>>
>> It is _only_ if you go forward with options 3, 4 or 5 that you
>> are giving the green light to users to do precisely what you are
>> concerned about -- using the unicode characters instead,
>> in possibly strange admixtures that nobody is ready to process.
>>
>> Remember, under the CIF2 specification as now written, it is
>> _not_ part of the CIF2 specification to determine the handling
>> of the characters in quoted strings other than to ensure that
>> those strings do not contain illegal characters from the point
>> of view of CIF2. Dealing with the validity of particular character
>> sequences in strings users provide is, just as in CIF1, the
>> responsibility of the application (i.e. the IUCr journal flows
>> or the PDB archiving flows).
>>
>> My apologies to James, who I know is trying to do what he believes
>> to be right, but I believe James has things backwards -- the "deep
>> breath" is provided by my proposal -- taking the time to properly engineer
>> the use of the extra characters UTF8 allows us to discuss clearly,
>> while James' push for an immediate prescriptive use of UTF8 with
>> prescriptions that differ drastically from what has been adopted
>> by all other frameworks (HTML, XML, Python, etc.) in ways that
>> are untested and unsupported by most existing software is
>> the untimely rush to judgement.
>>
>> I beg you to support options 1 and/or 2 to allow CIF2 to go forward
>> in all other respects while we all take a deep breath and deal
>> with the tricky issue you raised slowly and carefully without the
>> pressure of trying to have CIF2 itself ready for next summer.
>>
>> Regards,
>> Herbert
>>
>> At 9:34 AM +0000 9/27/10, SIMON WESTRIP wrote:
>> >I was not so concerned about invalidating existing CIFs, or even the
>> >likelihood that users will continue to write e.g. 'f\'oo' - this is
>> >a syntax error in CIF2 that is readily recoverable.
>> >
>> >Rather there is a large group of CIF1 users that are in the habit of
>> >using elided ASCII sequences to represent non-ASCII characters. With
>> >CIF2 these users will be able to use the unicode character itself.
>> >So we might end up with a mixture of escaped sequences and unicode
>> >characters (e.g. a user may have a keyboard shortcut for an accented
>> >character that forms part of their name, but might still resort to
>> >\a for alpha, under the assumption that \a is still valid because
>> >CIF2 is basically the same as CIF1, and, rightly or wrongly, they
>> >perceive the eliding mechanism as part of CIF syntax).
>> >
>> >I think this is an issue where we can't afford to take an 'as for
>> >CIF1...' approach, especially as the CIF1 specification isn't
>> >entirely satisfactory (e.g. there's an example in the line-folding
>> >protocol that uses elides in a file path to make a point, but
>> >actually these elides may easily be interpreted as escape
>> >sequences), and as the encoding issue is very much concerned with
>> >user practice, the large group of users that currently use elided
>> >character codes need to be aware of what the situation is in CIF2.
>> >
>> >I'm not convinced this issue should be left for discussion later;
>> >it is relevant when considering how the move beyond ASCII is specified.
>> >
>> >Cheers
>> >
>> >Simon
>> >
>> >From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>> >To: Group for discussing encoding and content validation schemes for
>> >CIF2 <cif2-encoding@iucr.org>
>> >Sent: Sunday, 26 September, 2010 23:14:55
>> >Subject: Re: [Cif2-encoding] How we wrap this up
>> >
>> >Dear Simon,
>> >
>> > The current CIF2 spec, with or without the changes I have suggested
>> >to temporarily resolve the encoding issue, is at best vague and
>> >confusing on the elide character issue. The interacting issue on
>> >which the CIF2 spec is clear is that we are changing the handling of
>> >quoted strings so that they end on the first occurrence of the
>> >quoting character, leaving the handling of elides to the calling
>> >application.
>> >
>> > This will be a problem -- the change from CIF1 in the termination of
>> >quoted strings along with the absence of a way of eliding the quotes
>> >will invalidate a significant number of existing CIFs without any simple
>> >mechanism to recover. Rather than reopen another endless discussion,
>> >I would suggest we simply add the Python string concatenation character
>> >"+" to ensure we can map all current CIF1 files and use Brian's common
>> >semantic features for the moment. We can then deal with the full elides
>> >discussion at a future date.
>> >
>> > Regards,
>> > Herbert
>> >
>> >At 1:40 PM -0700 9/26/10, SIMON WESTRIP wrote:
>> >>Dear all
>> >>
>> >>While reviewing my hypothetical 'to do' list for implementing CIF2
>> >>in current software, I realized that the issue of current support
>> >>for elided character codes hasn't really been addressed in the
>> >>context of CIF2. My 'to do' list contains notes that software could
>> >>treat them as keyboard shortcuts, and their use could be defined in
>> >>the dictionary. However, that was based on a distinct difference
>> >>between CIF1 and CIF2, while the current arguments for 'as for
>> >>CIF1...' suggest that the distinction between CIF1 and CIF2 should
>> >>almost be imperceptible.
>> >>
>> >>How is this issue to be addressed in the specification?
>> >>
>> >>Cheers
>> >>
>> >>Simon
>> >>
>> >>From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>> >>To: Group for discussing encoding and content validation schemes for
>> >>CIF2 <cif2-encoding@iucr.org>
>> >>Sent: Saturday, 25 September, 2010 20:37:46
>> >>Subject: Re: [Cif2-encoding] How we wrap this up
>> >>
>> >>Thank you for your cooperation. -- Herbert
>> >>
>> >>On Sat, 25 Sep 2010, SIMON WESTRIP wrote:
>> >>
>> >>> OK - as promised, I won't pursue the matter :-)
>> >>>
>> >>> ________________________________________
>> >>> From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>> >>> To: Group for discussing encoding and content validation schemes for CIF2
>> >>> <cif2-encoding@iucr.org>
>> >>> Sent: Saturday, 25 September, 2010 19:18:54
>> >>> Subject: Re: [Cif2-encoding] How we wrap this up
>> >>>
>> >>> Dear Simon,
>> >>>
>> >>> Unfortunately, that is likely to take us back into our infinite loop
>> >>> or into a diverging spiral. Right now, we would have UTF8 as no more
>> >>> or less a default for CIF2 than ASCII is for CIF1 -- i.e. a not too
>> >>> bad first guess as the likely default encoding for any given CIF, but
>> >>> not a formal constraint. I would suggest we leave the wording in that
>> >>> imprecise state, get CIF2 out and accepted and then work further on
>> >>> the encoding issue.
>> >>>
>> >>> Regards,
>> >>> Herbert
>> >>>
>> >>> On Sat, 25 Sep 2010, SIMON WESTRIP wrote:
>> >>>
>> >>> > Dear all
>> >>> >
>> >>> > In the event that CIF2 adopts the 'any encoding' approach, would
>> >>> > there be any objections to explicitly defining a default encoding
>> >>> > in the specification, to be defaulted to when there were no
>> >>> > indications to the contrary? At worst this would give CIF2 service
>> >>> > providers an excuse to interpret CIFs as e.g. UTF8 if they couldn't
>> >>> > determine the encoding by other means - but such intolerant service
>> >>> > providers would soon find that their service is not successful -
>> >>> > while at best this might raise awareness of the issues regarding
>> >>> > encoding once non-ASCII is used in a CIF. Essentially, it does not
>> >>> > require users to change their working practices, which is one of
>> >>> > the main arguments for 'any encoding'.
>> >>> >
>> >>> > So, CIF2 would remain 'any encoding', and specifications in terms
>> >>> > of e.g. "Herbert's as for CIF1..." might only require a single
>> >>> > sentence to define the default after stating what the 'preferred'
>> >>> > encoding was; the proposal might be phrased as "Herbert's as for
>> >>> > CIF1..." + "explicit default encoding"?
>> >>> >
>> >>> > I do not wish to prolong this debate - if there are objections I
>> >>> > will not launch into an endless round of exchanges that cover the
>> >>> > same ground that has led us this far.
>> >>> >
>> >>> > Cheers
>> >>> >
>> >>> > Simon
>> >>> >
>> >>> > ________________________________________
>> >>> > From: SIMON WESTRIP <simonwestrip@btinternet.com>
>> >>> > To: Group for discussing encoding and content validation schemes for CIF2
>> >>> > <cif2-encoding@iucr.org>
>> >>> > Sent: Friday, 24 September, 2010 20:10:13
>> >>> > Subject: Re: [Cif2-encoding] How we wrap this up
>> >>> >
>> >>> > Dear James
>> >>> >
>> >>> > As you may have gathered I have been reconsidering my position on
>> >>> > this issue. Please forgive me, but I would like to change my vote
>> >>> > if that is OK, in favour of the 'any encoding' camp. This apparent
>> >>> > U-turn is not a response to recent contributions; rather it is the
>> >>> > outcome of a meeting I had this morning where I demonstrated some
>> >>> > new software to the Managing Editor of IUCr journals.
>> >>> >
>> >>> > By way of explanation:
>> >>> >
>> >>> > I have been developing a new docx template which the IUCr editorial
>> >>> > office is shortly to release for use by authors.
>> >>> > The template will be packaged with some tools to extract data from
>> >>> > CIFs and tabulate them in the Word document, e.g. open an mmCIF,
>> >>> > click a button, and standard tables populated with data from the
>> >>> > CIF will be included in the document, acting as table templates
>> >>> > for the author to edit as appropriate for their manuscript.
>> >>> >
>> >>> > Inclusion of the mmCIF tools is part of an unofficial policy to
>> >>> > 'coax' biologists to start using/accepting mmCIF as a useful
>> >>> > medium, rather than as a product of their deposition to the PDB,
>> >>> > and to encourage them to become comfortable with passing mmCIFs
>> >>> > between applications, and even to edit the things (in the same way
>> >>> > as the core-CIF community treats CIFs). For example, our perception
>> >>> > is that there is no reason why an author should not feel free to
>> >>> > take an mmCIF that has been created by e.g. pdb_extract and
>> >>> > populate it using third-party software before uploading to the PDB
>> >>> > for deposition.
>> >>> >
>> >>> > This cause would not be furthered by effectively invalidating an
>> >>> > mmCIF if it were not to be encoded in one of the specified
>> >>> > encodings.
>> >>> >
>> >>> > So although I am uneasy about a specification that propagates
>> >>> > uncertainty, I'm also uneasy about alienating users, especially
>> >>> > when we are struggling to change their mindset as in the case of
>> >>> > the biological community (my perception of the biological
>> >>> > community's attitude to mmCIF is based on feedback from
>> >>> > authors/coeditors to IUCr journals).
>> >>> >
>> >>> > Granted this may not be the most compelling argument in favour of
>> >>> > 'any encoding', but recognizing the hurdles that may have to be
>> >>> > overcome once we move beyond ASCII whatever the CIF2 specification,
>> >>> > I support 'any encoding' as 'a means to an end'.
>> >>> >
>> >>> > I will not provide my preferences in terms of the numbered options
>> >>> > until you say so; after all, I have already voted and all this has
>> >>> > to be signed off by COMCIFs in any case.
>> >>> >
>> >>> > Cheers
>> >>> >
>> >>> > Simon
>> >>> >
>> >>> > ________________________________________
>> >>> > From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
>> >>> > To: Group for discussing encoding and content validation schemes for CIF2
>> >>> > <cif2-encoding@iucr.org>
>> >>> > Sent: Friday, 24 September, 2010 14:50:57
>> >>> > Subject: Re: [Cif2-encoding] How we wrap this up
>> >>> >
>> >>> > Dear Simon,
>> >>> >
>> >>> > It is exactly this sort of issue that drove me to support more
>> >>> > permissive encoding rules and ultimately to devise the UTF-8 +
>> >>> > UTF-16 + local proposal.
>> >>> >
>> >>> > Do please think about the considerations Herb raised.
>> >>> > As you reconsider your votes, I urge you also to ask yourself what,
>> >>> > *precisely*, a "text file" is, and to consider whether your answer
>> >>> > is functionally different from my "local". If you decide not, then
>> >>> > please consider what that answer implies about CIF2 support of
>> >>> > UTF-8 and UTF-16 (which evidently you favor) under each option on
>> >>> > the table, especially for CIFs containing non-ASCII characters.
>> >>> > Whatever you decide about the meaning of "text file", please
>> >>> > consider whether reasonable people might reach a different
>> >>> > conclusion, as I assert they might do, and to what extent the
>> >>> > standard needs to address that.
>> >>> >
>> >>> > Regards,
>> >>> >
>> >>> > John
>> >>> > --
>> >>> > John C. Bollinger, Ph.D.
>> >>> > Department of Structural Biology
>> >>> > St. Jude Children's Research Hospital
>> >>> >
>> >>> > >From: cif2-encoding-bounces@iucr.org
>> >>> > >[mailto:cif2-encoding-bounces@iucr.org] On Behalf Of SIMON WESTRIP
>> >>> > >Sent: Friday, September 24, 2010 7:53 AM
>> >>> > >To: Group for discussing encoding and content validation schemes for CIF2
>> >>> > >Subject: Re: [Cif2-encoding] How we wrap this up
>> >>> > >
>> >>> > >Dear Herbert
>> >>> > >
>> >>> > >Not for the first time, I find your argument persuasive. Brian's
>> >>> > >vote and explanation have also raised some questions that I would
>> >>> > >like to look into.
>> >>> > >
>> >>> > >I will confirm or otherwise my vote as soon as possible, assuming
>> >>> > >that is OK with James and assuming that this round of votes might
>> >>> > >wrap this up.
>> >>> > >
>> >>> > >Cheers
>> >>> > >
>> >>> > >Simon
>> >>> > >
>> >>> > >________________________________________
>> >>> > >From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>> >>> > >To: Group for discussing encoding and content validation schemes for CIF2
>> >>> > ><cif2-encoding@iucr.org>
>> >>> > >Sent: Friday, 24 September, 2010 13:17:14
>> >>> > >Subject: Re: [Cif2-encoding] How we wrap this up
>> >>> > >
>> >>> > >If he ignores the standard, in most cases all he has to do to
>> >>> > >comply with CIF2 is to run whatever applications he currently runs
>> >>> > >to produce CIF1 and, perhaps, in some cases, run a minor edit pass
>> >>> > >at the end, to convert for the minor syntactic differences and/or
>> >>> > >changed tags required to comply with CIF2 and the new
>> >>> > >dictionaries, but he is unlikely to have to do anything to deal
>> >>> > >with the messy business of whether his encoding is really a proper
>> >>> > >UTF8 encoding or not.
>> >>> > >
>> >>> > >The punishment, if he tries to comply, is that he has to totally
>> >>> > >uproot and reconfigure the environment in which he produces CIFs
>> >>> > >from whatever he is currently doing to create an environment in
>> >>> > >which he can reliably create and, more importantly, transmit
>> >>> > >compliant UTF8 files. This can be very tricky if he does only a
>> >>> > >partial job, say fudging in one special application (yet to be
>> >>> > >written), because if he stays with his old system, all kinds of
>> >>> > >tools will keep trying to transcode whatever he has produced back
>> >>> > >to whatever his system considers a standard. Those of us who have
>> >>> > >files, applications and tools that have lived through several
>> >>> > >generations of Macs are living proof of the problem. Macs now have
>> >>> > >excellent UTF8/16 unicode support, but every once in a while in
>> >>> > >working with a unicode file I find it has been strangely and
>> >>> > >unexpectedly converted to something else, and it can be really
>> >>> > >tricky to spot when the unaccented roman text part has been left
>> >>> > >untouched but just a few accented letters have gotten different
>> >>> > >accents.
>> >>> > >
>> >>> > >Mandating UTF8 is simply trying to shift a serious software
>> >>> > >problem from the central handlers of CIF (IUCr, PDB, etc.) to the
>> >>> > >external users. Most users will probably have the good sense to
>> >>> > >simply ignore the demand and leave the burden just where it is
>> >>> > >now. A few sophisticated users will probably adapt with no
>> >>> > >trouble, but the punishment for those users who blindly follow
>> >>> > >orders before we have a complete multiplatform supporting
>> >>> > >infrastructure in place by mandating UTF8 is severe, expensive and
>> >>> > >undeserved.
>> >>> > >Until and unless we have developed solid support, we will just be
>> >>> > >alienating people from CIF. I will continue to oppose such a move.
>> >>> >
>> >>> > [...]
>> >>> >
>> >>> > Email Disclaimer: www.stjude.org/emaildisclaimer
>> >>> > _______________________________________________
>> >>> > cif2-encoding mailing list
>> >>> > cif2-encoding@iucr.org
>> >>> > http://scripts.iucr.org/mailman/listinfo/cif2-encoding

--
=====================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                 +1-631-244-3035
                 yaya@dowling.edu
=====================================================
_______________________________________________
cif2-encoding mailing list
cif2-encoding@iucr.org
http://scripts.iucr.org/mailman/listinfo/cif2-encoding
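
The quoted-string point raised in this thread (that 'f\'oo' reads differently under the two rules, and that users may mix elides with Unicode characters) can be illustrated with a minimal Python sketch. It assumes the CIF1 convention that a quote character closes a string only when it is followed by whitespace or the end of input, and the CIF2 behaviour described in the thread, where the string ends at the first occurrence of the quoting character. The function names and the mixed-content heuristic are hypothetical and are not taken from any CIF library or from either specification.

```python
def cif1_closing_quote(text: str, start: int) -> int:
    """Index of the closing quote under the assumed CIF1 convention.

    A quote character terminates the string only when it is followed
    by whitespace or by the end of the input.
    """
    quote = text[start]
    for i in range(start + 1, len(text)):
        if text[i] == quote and (i + 1 == len(text) or text[i + 1].isspace()):
            return i
    return -1  # no closing quote found


def cif2_closing_quote(text: str, start: int) -> int:
    """Index of the closing quote under the rule described in the thread:
    the string ends at the first occurrence of the quoting character."""
    return text.find(text[start], start + 1)


def mixes_elides_and_unicode(value: str) -> bool:
    """Hypothetical heuristic: flag values that contain both a backslash
    sequence (a possible CIF1-style elide) and a non-ASCII character."""
    has_elide = any(
        ch == "\\" and i + 1 < len(value) and not value[i + 1].isspace()
        for i, ch in enumerate(value)
    )
    has_non_ascii = any(ord(ch) > 127 for ch in value)
    return has_elide and has_non_ascii


# The example from the thread: a quoted value with an embedded quote.
sample = r"'f\'oo'"
end1 = cif1_closing_quote(sample, 0)
end2 = cif2_closing_quote(sample, 0)

cif1_value = sample[1:end1]      # the whole value, embedded quote included
cif2_value = sample[1:end2]      # stops at the embedded quote
cif2_leftover = sample[end2 + 1:]  # trailing text a CIF2 parser must reject
```

Under these assumptions the CIF1 reading yields one complete value, while the CIF2 reading stops early and leaves trailing characters, which is the recoverable syntax error mentioned above. The heuristic only shows how an application might notice the "strange admixtures" discussed in the thread; deciding what to do with them remains the application's job.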