Re: [Cif2-encoding] How we wrap this up
- To: Group for discussing encoding and content validation schemes for CIF2 <cif2-encoding@xxxxxxxx>
- Subject: Re: [Cif2-encoding] How we wrap this up
- From: "Herbert J. Bernstein" <yaya@xxxxxxxxxxxxxxxxxxxxxxx>
- Date: Sat, 25 Sep 2010 14:18:54 -0400 (EDT)
- In-Reply-To: <463665.7127.qm@web87004.mail.ird.yahoo.com>
- References: <AANLkTi=hmKNFMgaeMqt69=sG6dOmxZRUrffB1khjF+mZ@mail.gmail.com><63870.31508.qm@web87006.mail.ird.yahoo.com><8F77913624F7524AACD2A92EAF3BFA5416659DEDDC@SJMEMXMBS11.stjude.sjcrh.local> <80062.82001.qm@web87012.mail.ird.yahoo.com><a06240802c8c165d79c1a@[149.72.2.188]><162941.37460.qm@web87004.mail.ird.yahoo.com><alpine.BSF.2.00.1009231729300.51637@epsilon.pair.com><780727.99055.qm@web87010.mail.ird.yahoo.com><alpine.BSF.2.00.1009232100530.35116@epsilon.pair.com><526633.3484.qm@web87004.mail.ird.yahoo.com><alpine.BSF.2.00.1009240742480.8859@epsilon.pair.com><613218.81205.qm@web87011.mail.ird.yahoo.com><8F77913624F7524AACD2A92EAF3BFA5416659DEDDE@SJMEMXMBS11.stjude.sjcrh.local><281388.90819.qm@web87012.mail.ird.yahoo.com><463665.7127.qm@web87004.mail.ird.yahoo.com>
Dear Simon,

Unfortunately, that is likely to take us back into our infinite loop or into a diverging spiral. Right now, we would have UTF8 as no more or less a default for CIF2 than ASCII is for CIF1 -- i.e. a not too bad first guess as the likely default encoding for any given CIF, but not a formal constraint. I would suggest we leave the wording in that imprecise state, get CIF2 out and accepted, and then work further on the encoding issue.

Regards,
Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
yaya@dowling.edu
=====================================================

On Sat, 25 Sep 2010, SIMON WESTRIP wrote:

> Dear all
>
> In the event that CIF2 adopts the 'any encoding' approach, would there be any objections to explicitly defining a default encoding in the specification, to be used when there are no indications to the contrary? At worst this would give CIF2 service providers an excuse to interpret CIFs as e.g. UTF8 if they couldn't determine the encoding by other means - but such intolerant service providers would soon find that their service is not successful - while at best this might raise awareness of the issues regarding encoding once non-ASCII is used in a CIF. Essentially, it does not require users to change their working practices, which is one of the main arguments for 'any encoding'.
>
> So, CIF2 would remain 'any encoding', and specifications in terms of e.g. "Herbert's as for CIF1..." might only require a single sentence to define the default after stating what the 'preferred' encoding was; the proposal might be phrased as "Herbert's as for CIF1..." + "explicit default encoding"?
>
> I do not wish to prolong this debate - if there are objections I will not launch into an endless round of exchanges that cover the same ground that has led us this far.
>
> Cheers
>
> Simon
>
> ____________________________________________________________________________
> From: SIMON WESTRIP <simonwestrip@btinternet.com>
> To: Group for discussing encoding and content validation schemes for CIF2 <cif2-encoding@iucr.org>
> Sent: Friday, 24 September, 2010 20:10:13
> Subject: Re: [Cif2-encoding] How we wrap this up
>
> Dear James
>
> As you may have gathered, I have been reconsidering my position on this issue. Please forgive me, but I would like to change my vote, if that is OK, in favour of the 'any encoding' camp. This apparent U-turn is not a response to recent contributions; rather it is the outcome of a meeting I had this morning where I demonstrated some new software to the Managing Editor of IUCr journals.
>
> By way of explanation:
>
> I have been developing a new docx template which the IUCr editorial office is shortly to release for use by authors. The template will be packaged with some tools to extract data from CIFs and tabulate them in the Word document, e.g. open an mmCIF, click a button, and standard tables populated with data from the CIF will be included in the document, acting as table templates for the author to edit as appropriate for their manuscript.
>
> Inclusion of the mmCIF tools is part of an unofficial policy to 'coax' biologists to start using/accepting mmCIF as a useful medium, rather than as a product of their deposition to the PDB, and to encourage them to become comfortable with passing mmCIFs between applications, and even to edit the things (in the same way as the core-CIF community treats CIFs). For example, our perception is that there is no reason why an author should not feel free to take an mmCIF that has been created by e.g. pdb_extract and populate it using third-party software before uploading to the PDB for deposition.
>
> This cause would not be furthered by effectively invalidating an mmCIF if it were not to be encoded in one of the specified encodings.
>
> So although I am uneasy about a specification that propagates uncertainty, I'm also uneasy about alienating users, especially when we are struggling to change their mindset as in the case of the biological community (my perception of the biological community's attitude to mmCIF is based on feedback from authors/coeditors to IUCr journals).
>
> Granted this may not be the most compelling argument in favour of 'any encoding', but recognizing the hurdles that may have to be overcome once we move beyond ASCII whatever the CIF2 specification, I support 'any encoding' as 'a means to an end'.
>
> I will not provide my preferences in terms of the numbered options until you say so; after all, I have already voted and all this has to be signed off by COMCIFS in any case.
>
> Cheers
>
> Simon
>
> ____________________________________________________________________________
> From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
> To: Group for discussing encoding and content validation schemes for CIF2 <cif2-encoding@iucr.org>
> Sent: Friday, 24 September, 2010 14:50:57
> Subject: Re: [Cif2-encoding] How we wrap this up
>
> Dear Simon,
>
> It is exactly this sort of issue that drove me to support more permissive encoding rules and ultimately to devise the UTF-8 + UTF-16 + local proposal.
>
> Do please think about the considerations Herb raised. As you reconsider your votes, I urge you also to ask yourself what, *precisely*, a "text file" is, and to consider whether your answer is functionally different from my "local". If you decide not, then please consider what that answer implies about CIF2 support of UTF-8 and UTF-16 (which evidently you favor) under each option on the table, especially for CIFs containing non-ASCII characters. Whatever you decide about the meaning of "text file", please consider whether reasonable people might reach a different conclusion, as I assert they might do, and to what extent the standard needs to address that.
>
> Regards,
>
> John
> --
> John C. Bollinger, Ph.D.
> Department of Structural Biology
> St. Jude Children's Research Hospital
>
> >From: cif2-encoding-bounces@iucr.org [mailto:cif2-encoding-bounces@iucr.org] On Behalf Of SIMON WESTRIP
> >Sent: Friday, September 24, 2010 7:53 AM
> >To: Group for discussing encoding and content validation schemes for CIF2
> >Subject: Re: [Cif2-encoding] How we wrap this up
> >
> >Dear Herbert
> >
> >Not for the first time, I find your argument persuasive. Brian's vote and explanation have also raised some questions that I would like to look into.
> >
> >I will confirm or otherwise my vote as soon as possible, assuming that is OK with James and assuming that this round of votes might wrap this up.
> >
> >Cheers
> >
> >Simon
> >
> >________________________________________
> >From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
> >To: Group for discussing encoding and content validation schemes for CIF2 <cif2-encoding@iucr.org>
> >Sent: Friday, 24 September, 2010 13:17:14
> >Subject: Re: [Cif2-encoding] How we wrap this up
> >
> >If he ignores the standard, in most cases all he has to do to comply with CIF2 is to run whatever applications he currently runs to produce CIF1 and, perhaps, in some cases, run a minor edit pass at the end, to convert for the minor syntactic differences and/or changed tags required to comply with CIF2 and the new dictionaries, but he is unlikely to have to do anything to deal with the messy business of whether his encoding is really a proper UTF8 encoding or not.
> >
> >The punishment if he tries to comply is that he has to totally uproot and reconfigure the environment in which he produces CIFs from whatever he is currently doing to create an environment in which he can reliably create and, more importantly, transmit compliant UTF8 files. This can be very tricky if he does only a partial job, say fudging in one special application (yet to be written), because if he stays with his old system, all kinds of tools will keep trying to transcode whatever he has produced back to whatever his system considers a standard. Those of us who have files, applications and tools that have lived through several generations of Macs are living proof of the problem. Macs now have excellent UTF8/16 Unicode support, but every once in a while in working with a Unicode file I find it has been strangely and unexpectedly converted to something else, and it can be really tricky to spot when the unaccented roman text part has been left untouched but just a few accented letters have gotten different accents.
> >
> >Mandating UTF8 is simply trying to shift a serious software problem from the central handlers of CIF (IUCr, PDB, etc.) to the external users. Most users will probably have the good sense to simply ignore the demand and leave the burden just where it is now. A few sophisticated users will probably adapt with no trouble, but the punishment for those users who blindly follow orders, if we mandate UTF8 before we have a complete multiplatform supporting infrastructure in place, is severe, expensive and undeserved. Until and unless we have developed solid support, we will just be alienating people from CIF. I will continue to oppose such a move.
> >
> > [...]
>
> Email Disclaimer: www.stjude.org/emaildisclaimer
_______________________________________________
cif2-encoding mailing list
cif2-encoding@iucr.org
http://scripts.iucr.org/mailman/listinfo/cif2-encoding