Discussion List Archives

Re: [Cif2-encoding] How we wrap this up

Dear Simon,

   We do not seem to be communicating effectively.  Do you have a
Skype account?  We really need a meeting.

   Regards,
     Herbert


At 3:27 PM +0000 9/27/10, SIMON WESTRIP wrote:
>I see nothing wrong with a strategy to introduce CIF2 if necessary.
>My initial thoughts are that the current 'as for CIF1...' description
>is not best suited as base specification on which to build full
>unicode support, should such a strategy be pursued.
>
>However, I will reflect on this along with recent contributions from
>James and John...
>
>Cheers
>
>Simon
>
>
>
>From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>To: Group for discussing encoding and content validation schemes for 
>CIF2 <cif2-encoding@iucr.org>
>Sent: Monday, 27 September, 2010 14:45:16
>Subject: Re: [Cif2-encoding] How we wrap this up
>
>The problem is that options 3, 4 and 5 specifically prescribe the
>use of Unicode characters (that is the entire point of those
>options -- and that is the point in dispute -- whether we should
>be prescribing UTF8 or using it as we now use ASCII, as a way to
>be clear what we are talking about as in CIF1) and we simply are not
>ready to deal with such a requirement yet.
>
>I take the blame for starting this discussion many years ago when
>I simply asked for just what my motion says, that we start using
>UTF8 in the same way we had been using ASCII.  Unfortunately
>this discussion has turned into a strong push to focus CIF on
>that particular encoding, stop using Brian's elides, etc.  With
>the current weak state of software support for CIF and the large
>investment at the IUCr and at the PDB in current workflows, I
>think it would be a very disruptive and expensive change to make
>right now.  God and the Devil are in the details.
>
>Note that I am _not_ basing this argument on imgCIF.  At this point
>it appears, unfortunately, that CIF2 and imgCIF will have to diverge.
>If we have enough face-to-face discussions, perhaps we can bring
>them together again, as we did in 1998, but that is an even more
>difficult discussion than the one we need to have on encodings.
>What I hope we will do is to go at this in incremental stages:
>
>1.  Make the transition from CIF1 to CIF2 using new dictionaries
>but allowing most data files to remain unchanged, and providing
>simple algorithmic transformations for the rest, but keeping
>most of the current semantic extensions that we have in CIF1,
>focusing our energy on getting the new dictionaries used and
>making use of dREL;
>
>2.  Work on a CIF2.1 that, by creative and well-supported use
>of Unicode, allows for a well organized transition from Brian's
>elides to use of Unicode characters;
>
>3.  Then working in that context, whatever it turns out to be,
>work on having imgCIF make the transition to CIF2 in some
>reasonably compatible way.
>
>I see how to do item 1 for next summer.  I don't see how to do 2 and
>3 in that time frame, though I am sure we could make a dent in
>them if we could meet face to face.  email tends to stiffen too
>many positions.
>
>Regards,
>   Herbert
>
>=====================================================
>Herbert J. Bernstein, Professor of Computer Science
>   Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
>
>                 +1-631-244-3035
>                 yaya@dowling.edu
>=====================================================
>
>On Mon, 27 Sep 2010, SIMON WESTRIP wrote:
>
>>  Dear Herbert
>>
>>  I do not understand why it is *only* options 3, 4 or 5 that allow
>>  users to start using unicode characters?
>>
>>  More generally, are you suggesting that the use of anything but
>>  ASCII in a data value is only allowed if e.g. the dictionary
>>  definition of the data item permits, or even only if the IUCr says
>>  that's OK?
>>
>>  Fundamentally, I'm starting to infer that the purpose of the 'as for
>>  CIF1...' approach to encoding is to open the door to full unicode
>>  support, but not actually let anyone cross the threshold?
>>
>>
>>  Cheers
>>
>>  Simon
>>
>>  ____________________________________________________________________________
>>  From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>>  To: Group for discussing encoding and content validation schemes for CIF2
>>  <cif2-encoding@iucr.org>
>>  Sent: Monday, 27 September, 2010 11:48:49
>>  Subject: Re: [Cif2-encoding] How we wrap this up
>>
>>  Dear Simon,
>>
>>    Under the CIF2 specification with UTF8 in place of ASCII there is
>>  _no_ change in the use of elided ASCII sequences to represent non-ASCII
>>  characters until and unless the IUCr publications office decides that,
>>  for that particular application, they are ready to accept something
>>  new.
>>
>>    It is _only_ if you go forward with options 3, 4 or 5 that you
>>  are giving the green light to users to do precisely what you are
>>  concerned about -- using the unicode characters instead,
>>  in possibly strange admixtures that nobody is ready to process.
>>
>>    Remember, under the CIF2 specification as now written, it is
>>  _not_ part of the CIF2 specification to determine the handling
>>  of the characters in quoted strings other than to ensure that
>>  those strings do not contain illegal characters from the point
>>  of view of CIF2.  Dealing with the validity of particular character
>>  sequences in strings users provide is, just as in CIF1, the
>>  responsibility of the application (i.e. the IUCr journal flows
>>  or the PDB archiving flows).
>>
>>    My apologies to James, who I know is trying to do what he believes
>>  to be right, but I believe James has things backwards -- the "deep
>>  breath" is provided by my proposal -- taking the time to properly engineer
>>  the use of the extra characters UTF8 allows us to discuss clearly,
>>  while James' push for an immediate prescriptive use of UTF8 with
>>  prescriptions that differ drastically from what has been adopted
>>  by all other frameworks (HTML, XML, python, etc.) in ways that
>>  are untested and unsupported by most existing software is
>>  the untimely rush to judgement.
>>
>>    I beg you to support options 1 and/or 2 to allow CIF2 to go forward
>>  in all other respects while we all take a deep breath and deal
>>  with the tricky issue you raised slowly and carefully without the
>>  pressure of trying to have CIF2 itself ready for next summer.
>>
>>    Regards,
>>      Herbert
>>
>>  At 9:34 AM +0000 9/27/10, SIMON WESTRIP wrote:
>>  >I was not so concerned about invalidating existing CIFs, or even the
>>  >likelihood that users will continue to write e.g. 'f\'oo' - this is
>>  >a syntax error in CIF2 that is readily recoverable.
>>  >
>>  >Rather there is a large group of CIF1 users that are in the habit of
>>  >using elided ASCII sequences to represent non-ASCII characters. With
>>  >CIF2 these users will be able to use the unicode character itself.
>>  >So we might end up with a mixture of escaped sequences and unicode
>>  >characters (e.g. a user may have a keyboard shortcut for an accented
>>  >character that forms part of their name, but might still resort to
>>  >\a for alpha, under the assumption that \a is still valid because
>>  >CIF2 is basically the same as CIF1, and, rightly or wrongly, they
>>  >perceive the eliding mechanism as part of CIF syntax).
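
The mixed usage described above can be illustrated with a minimal Python sketch that expands a few CIF1-style elide codes into Unicode while leaving characters that are already Unicode untouched. The code table is a small illustrative subset and is an assumption: only \a for alpha appears in this thread, and the accent-before-letter order (\'e for e-acute) follows my reading of the CIF1 markup conventions. The function name expand_elides is hypothetical.

```python
import unicodedata

# Illustrative subset only (assumed from the CIF1 markup conventions).
GREEK = {"a": "\u03b1", "b": "\u03b2", "g": "\u03b3"}
# Accent codes mapped to Unicode combining marks (assumed subset).
ACCENTS = {"'": "\u0301", "`": "\u0300", "^": "\u0302", '"': "\u0308", "~": "\u0303"}


def expand_elides(text: str) -> str:
    """Replace backslash elide codes with Unicode; pass everything else through."""
    out = []
    i = 0
    while i < len(text):
        ch = text[i]
        nxt = text[i + 1] if i + 1 < len(text) else ""
        if ch == "\\" and nxt in ACCENTS and i + 2 < len(text) and text[i + 2].isalpha():
            # Accent code precedes the letter: emit letter + combining mark.
            out.append(text[i + 2] + ACCENTS[nxt])
            i += 3
        elif ch == "\\" and nxt in GREEK:
            out.append(GREEK[nxt])
            i += 2
        else:
            out.append(ch)
            i += 1
    # Compose letter + combining mark into a single code point where possible.
    return unicodedata.normalize("NFC", "".join(out))
```

A value that mixes both styles, such as "Andr\u00e9 \\a" in Python notation, comes out fully in Unicode. A Windows-style path such as C:\a would be rewritten as well, which shows how easily a backslash in ordinary text can be mistaken for an elide.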
>>  >
>>  >I think this is an issue where we can't afford to take an 'as for
>>  >CIF1...' approach, especially as the CIF1 specification isn't
>>  >entirely satisfactory (e.g. there's an example in the line-folding
>>  >protocol that uses elides in a file path to make a point, but
>>  >actually these elides may easily be interpreted as escape
>>  >sequences). As the encoding issue is very much concerned with user
>>  >practice, the large group of users that currently use elided
>>  >character codes need to be aware of what the situation is in CIF2.
>>  >
>>  >I'm not convinced this issue should be left for discussion later;
>>  >it is relevant when considering how the move beyond ASCII is specified.
>>  >
>>  >Cheers
>>  >
>>  >Simon
>>  >
>>  >
>>  >
>>  >
>>  >From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>>  >To: Group for discussing encoding and content validation schemes for
>>  >CIF2 <cif2-encoding@iucr.org>
>>  >Sent: Sunday, 26 September, 2010 23:14:55
>>  >Subject: Re: [Cif2-encoding] How we wrap this up
>>  >
>>  >Dear Simon,
>>  >
>>  >  The current CIF2 spec, with or without the changes I have suggested
>>  >to temporarily resolve the encoding issue, is at best vague and
>>  >confusing on the elide character issue.  The interacting issue on
>>  >which the CIF2 spec is clear is that we are changing the handling of
>>  >quoted strings so that they end on the first occurrence of the
>>  >quoting character, leaving the handling of elides to the calling
>>  >application.
>>  >
>>  >  This will be a problem -- the change from CIF1 in the termination of
>>  >quoted strings along with the absence of a way of eliding the quotes
>>  >will invalidate a significant number of existing CIFs without any simple
>>  >mechanism to recover.  Rather than reopen another endless discussion,
>>  >I would suggest we simply add the python string concatenation character
>>  >"+" to ensure we can map all current CIF1 files and use Brian's common
>>  >semantic features for the moment.  We can then deal with the full elides
>>  >discussion at a future date.
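
To make the change in quoted-string termination concrete, here is a minimal Python sketch of the two rules. The CIF1 rule used here (a closing quote counts only when followed by whitespace or end of line) is my reading of the CIF 1.1 syntax, and the CIF2 rule is the one described in this message (the string ends on the first occurrence of the quoting character). Both function names are hypothetical, and neither function attempts the "+" concatenation suggested above.

```python
def read_quoted_cif1(line: str) -> str:
    """CIF1-style: the string ends at a quote followed by whitespace or end of line."""
    quote = line[0]
    for i in range(1, len(line)):
        at_end = i + 1 == len(line)
        if line[i] == quote and (at_end or line[i + 1].isspace()):
            return line[1:i]
    raise ValueError("unterminated quoted string")


def read_quoted_cif2(line: str) -> str:
    """CIF2-style (as described in this thread): ends at the first matching quote."""
    quote = line[0]
    end = line.find(quote, 1)
    if end == -1:
        raise ValueError("unterminated quoted string")
    rest = line[end + 1:]
    if rest and not rest[0].isspace():
        # Text glued to the closing quote is what makes old CIF1 values fail.
        raise ValueError(f"unexpected text after closing quote: {rest!r}")
    return line[1:end]
```

Under these assumptions the value 'a dog's life' is accepted by the first function and rejected by the second, which is the class of existing files that would be invalidated.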
>>  >
>>  >  Regards,
>>  >    Herbert
>>  >
>>  >
>>  >
>>  >
>>  >
>>  >At 1:40 PM -0700 9/26/10, SIMON WESTRIP wrote:
>>  >>Dear all
>>  >>
>>  >>While reviewing my hypothetical 'to do' list for implementing CIF2
>>  >>in current software, I realized that the issue of current support
>>  >>for elided character codes hasn't really been addressed in the
>>  >>context of CIF2. My 'to do' list contains notes that software could
>>  >>treat them as keyboard shortcuts, and their use could be defined in
>>  >>the dictionary. However, that was based on a distinct difference
>>  >>between CIF1 and CIF2, while the current arguments for 'as for
>>  >>CIF1...' suggest that the distinction between CIF1 and CIF2 should
>>  >>almost be imperceptible.
>>  >>
>>  >>How is this issue to be addressed in the specification?
>>  >>
>>  >>Cheers
>>  >>
>>  >>Simon
>>  >>
>>  >>
>>  >>
>>  >>From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>>  >>To: Group for discussing encoding and content validation schemes for
>>  >>CIF2 <cif2-encoding@iucr.org>
>>  >>Sent: Saturday, 25 September, 2010 20:37:46
>>  >>Subject: Re: [Cif2-encoding] How we wrap this up
>>  >>
>>  >>Thank you for your cooperation. -- Herbert
>>  >>
>>  >>=====================================================
>>  >>Herbert J. Bernstein, Professor of Computer Science
>>  >>  Dowling College, Kramer Science Center, KSC 121
>>  >>        Idle Hour Blvd, Oakdale, NY, 11769
>>  >>
>>  >>                +1-631-244-3035
>>  >>                yaya@dowling.edu
>>  >>=====================================================
>>  >>
>>  >>On Sat, 25 Sep 2010, SIMON WESTRIP wrote:
>>  >>
>>  >>>  OK - as promised, I won't pursue the matter :-)
>>  >>>
>>  >>>
>>  >>>
>>  >>>  ________________________________________________________________________
>>  >>>  From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>>  >>>  To: Group for discussing encoding and content validation schemes for CIF2
>>  >>>  <cif2-encoding@iucr.org>
>>  >>>  Sent: Saturday, 25 September, 2010 19:18:54
>>  >>>  Subject: Re: [Cif2-encoding] How we wrap this up
>>  >>>
>>  >>>  Dear Simon,
>>  >>>
>>  >>>    Unfortunately, that is likely to take us back into our infinite
>>  >>>  loop or into a diverging spiral.  Right now, we would have UTF8 as
>>  >>>  no more or less a default for CIF2 than ASCII is for CIF1 -- i.e. a
>>  >>>  not too bad first guess as the likely default encoding for any
>>  >>>  given CIF, but not a formal constraint.  I would suggest we leave
>>  >>>  the wording in that imprecise state, get CIF2 out and accepted and
>>  >>>  then work further on the encoding issue.
>>  >>>
>>  >>>    Regards,
>>  >>>      Herbert
>>  >>>
>>  >>>  =====================================================
>>  >>>  Herbert J. Bernstein, Professor of Computer Science
>>  >>>    Dowling College, Kramer Science Center, KSC 121
>>  >>>          Idle Hour Blvd, Oakdale, NY, 11769
>>  >>>
>>  >>>                  +1-631-244-3035
>>  >>>                  yaya@dowling.edu
>>  >>>  =====================================================
>>  >>>
>>  >>>  On Sat, 25 Sep 2010, SIMON WESTRIP wrote:
>>  >>>
>>  >>>  > Dear all
>>  >>>  >
>>  >>>  > In the event that CIF2 adopts the 'any encoding' approach, would
>>  >>>  > there be any objections to explicitly defining a default encoding
>>  >>>  > in the specification, to be defaulted to when there were no
>>  >>>  > indications to the contrary? At worst this would give CIF2 service
>>  >>>  > providers an excuse to interpret CIFs as e.g. UTF8 if they couldn't
>>  >>>  > determine the encoding by other means - but such intolerant
>>  >>>  > service providers would soon find that their service is not
>>  >>>  > successful - while at best this might raise awareness of the
>>  >>>  > issues regarding encoding once non-ASCII is used in a CIF.
>>  >>>  > Essentially, it does not require users to change their working
>>  >>>  > practices, which is one of the main arguments for 'any encoding'.
>>  >>>  >
>>  >>>  > So, CIF2 would remain 'any encoding', and specifications in terms
>>  >>>  > of e.g. "Herbert's as for CIF1..." might only require a single
>>  >>>  > sentence to define the default after stating what the 'preferred'
>>  >>>  > encoding was; the proposal might be phrased as "Herbert's as for
>>  >>>  > CIF1..." + "explicit default encoding"?
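
A minimal Python sketch of what "any encoding plus an explicit default" could mean for a reader: use whatever encoding is indicated, and fall back to the default only when nothing is indicated. The choice of UTF-8 as the default follows the example given in this message; the function name decode_cif and the declared parameter are hypothetical, since the thread does not say how an encoding would be indicated.

```python
from typing import Optional

# Assumed default, following the "e.g. UTF8" example in the message above.
DEFAULT_ENCODING = "utf-8"


def decode_cif(raw: bytes, declared: Optional[str] = None) -> str:
    """Decode CIF bytes with the declared encoding, else with the default.

    `declared` stands in for any out-of-band indication of the encoding
    (hypothetical; no such mechanism is defined in this thread).
    """
    encoding = declared or DEFAULT_ENCODING
    # Strict decoding: bytes that do not fit the chosen encoding raise
    # UnicodeDecodeError instead of being silently mangled.
    return raw.decode(encoding)
```

A pure ASCII file decodes identically with or without a declaration, so existing working practices are unaffected; only files with non-ASCII content and no indication of their encoding depend on the default.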
>>  >>>  >
>>  >>>  > I do not wish to prolong this debate - if there are objections I
>>  >>>  > will not launch into an endless round of exchanges that cover the
>>  >>>  > same ground that has led us this far.
>>  >>>  >
>>  >>>  > Cheers
>>  >>>  >
>>  >>>  > Simon
>>  >>>  >
>>  >>>  >
>>  >>>  >
>>  >>>  >
>>  >>>  >
>>  >>>  >
>>  >>>  > ________________________________________________________________________
>>  >>>  > From: SIMON WESTRIP <simonwestrip@btinternet.com>
>>  >>>  > To: Group for discussing encoding and content validation schemes for CIF2
>>  >>>  > <cif2-encoding@iucr.org>
>>  >>>  > Sent: Friday, 24 September, 2010 20:10:13
>>  >>>  > Subject: Re: [Cif2-encoding] How we wrap this up
>>  >>>  >
>>  >>>  > Dear James
>>  >>>  >
>>  >>>  > As you may have gathered I have been reconsidering my position on
>>  >>>  > this issue. Please forgive me, but I would like to change my vote
>>  >>>  > if that is OK, in favour of the 'any encoding' camp. This apparent
>>  >>>  > U-turn is not a response to recent contributions; rather it is the
>>  >>>  > outcome of a meeting I had this morning where I demonstrated some
>>  >>>  > new software to the Managing Editor of IUCr journals.
>>  >>>  >
>>  >>>  > By way of explanation:
>>  >>>  >
>>  >>>  > I have been developing a new docx template which the IUCr
>>  >>>  > editorial office is shortly to release for use by authors. The
>>  >>>  > template will be packaged with some tools to extract data from
>>  >>>  > CIFs and tabulate them in the Word document, e.g. open an mmCIF,
>>  >>>  > click a button, and standard tables populated with data from the
>>  >>>  > CIF will be included in the document, acting as table templates
>>  >>>  > for the author to edit as appropriate for their manuscript.
>>  >>>  >
>>  >>>  > Inclusion of the mmCIF tools is part of an unofficial policy to
>>  >>>  > 'coax' biologists to start using/accepting mmCIF as a useful
>>  >>>  > medium, rather than as a product of their deposition to the PDB,
>>  >>>  > and to encourage them to become comfortable with passing mmCIFs
>>  >>>  > between applications, and even to edit the things (in the same
>>  >>>  > way as the core-CIF community treats CIFs). For example, our
>>  >>>  > perception is that there is no reason why an author should not
>>  >>>  > feel free to take an mmCIF that has been created by e.g.
>>  >>>  > pdb_extract and populate it using third-party software before
>>  >>>  > uploading to the PDB for deposition.
>>  >>>  >
>>  >>>  > This cause would not be furthered by effectively invalidating an
>>  >>>  > mmCIF if it were not to be encoded in one of the specified
>>  >>>  > encodings.
>>  >>>  >
>>  >>>  > So although I am uneasy about a specification that propagates
>>  >>>  > uncertainty, I'm also uneasy about alienating users, especially
>>  >>>  > when we are struggling to change their mindset as in the case of
>>  >>>  > the biological community (my perception of the biological
>>  >>>  > community's attitude to mmCIF is based on feedback from
>>  >>>  > authors/coeditors to IUCr journals).
>>  >>>  >
>>  >>>  > Granted this may not be the most compelling argument in favour of
>>  >>>  > 'any encoding', but recognizing the hurdles that may have to be
>>  >>>  > overcome once we move beyond ASCII whatever the CIF2
>>  >>>  > specification, I support 'any encoding' as 'a means to an end'.
>>  >>>  >
>>  >>>  > I will not provide my preferences in terms of the numbered options
>>  >>>  > until you say so; after all, I have already voted and all this has
>>  >>>  > to be signed off by COMCIFs in any case.
>>  >>>  >
>>  >>>  > Cheers
>>  >>>  >
>>  >>>  > Simon
>>  >>>  >
>>  >>>  >
>>  >>>  >
>>  >>>  >
>>  >>>  > ________________________________________________________________________
>>  >>>  > From: "Bollinger, John C" <John.Bollinger@STJUDE.ORG>
>>  >>>  > To: Group for discussing encoding and content validation schemes for CIF2
>>  >>>  > <cif2-encoding@iucr.org>
>>  >>>  > Sent: Friday, 24 September, 2010 14:50:57
>>  >>>  > Subject: Re: [Cif2-encoding] How we wrap this up
>>  >>>  >
>>  >>>  > Dear Simon,
>>  >>>  >
>>  >>>  > It is exactly this sort of issue that drove me to support more
>>  >>>  > permissive encoding rules and ultimately to devise the UTF-8 +
>>  >>>  > UTF-16 + local proposal.
>>  >>>  >
>>  >>>  > Do please think about the considerations Herb raised.  As you
>>  >>>  > reconsider your votes, I urge you also to ask yourself what,
>>  >>>  > *precisely*, a "text file" is, and to consider whether your answer
>>  >>>  > is functionally different from my "local".  If you decide not,
>>  >>>  > then please consider what that answer implies about CIF2 support
>>  >>>  > of UTF-8 and UTF-16 (which evidently you favor) under each option
>>  >>>  > on the table, especially for CIFs containing non-ASCII characters.
>>  >>>  > Whatever you decide about the meaning of "text file", please
>>  >>>  > consider whether reasonable people might reach a different
>>  >>>  > conclusion, as I assert they might do, and to what extent the
>>  >>>  > standard needs to address that.
>>  >>>  >
>>  >>>  >
>>  >>>  > Regards,
>>  >>>  >
>>  >>>  > John
>>  >>>  > --
>>  >>>  > John C. Bollinger, Ph.D.
>>  >>>  > Department of Structural Biology
>>  >>>  > St. Jude Children's Research Hospital
>>  >>>  >
>>  >>>  >
>>  >>>  > >From: cif2-encoding-bounces@iucr.org
>>  >>>  > >[mailto:cif2-encoding-bounces@iucr.org] On Behalf Of SIMON WESTRIP
>>  >>>  > >Sent: Friday, September 24, 2010 7:53 AM
>>  >>>  > >To: Group for discussing encoding and content validation schemes for CIF2
>>  >>>  > >Subject: Re: [Cif2-encoding] How we wrap this up
>>  >>>  > >
>>  >>>  > >Dear Herbert
>>  >>>  > >
>>  >>>  > >Not for the first time, I find your argument persuasive. Brian's
>>  >>>  > >vote and explanation have also raised some questions that I would
>>  >>>  > >like to look into.
>>  >>>  > >
>>  >>>  > >I will confirm or otherwise my vote as soon as possible, assuming
>>  >>>  > >that is OK with James and assuming that this round of votes might
>>  >>>  > >wrap this up.
>>  >>>  > >
>>  >>>  > >Cheers
>>  >>>  > >
>>  >>>  > >Simon
>>  >>>  > >
>>  >>>  > >________________________________________
>>  >>>  > >From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>>  >>>  > >To: Group for discussing encoding and content validation schemes for CIF2
>>  >>>  > ><cif2-encoding@iucr.org>
>>  >>>  > >Sent: Friday, 24 September, 2010 13:17:14
>>  >>>  > >Subject: Re: [Cif2-encoding] How we wrap this up
>>  >>>  > >
>>  >>>  > >If he ignores the standard, in most cases all he has to do to
>>  >>>  > >comply with CIF2 is to run whatever applications he currently
>>  >>>  > >runs to produce CIF1 and, perhaps, in some cases, run a minor
>>  >>>  > >edit pass at the end, to convert for the minor syntactic
>>  >>>  > >differences and/or changed tags required to comply with CIF2 and
>>  >>>  > >the new dictionaries, but he is unlikely to have to do anything
>>  >>>  > >to deal with the messy business of whether his encoding is really
>>  >>>  > >a proper UTF8 encoding or not.
>>  >>>  >
>>  >>>  > >The punishment if he tries to comply is that he has to totally
>>  >>>  > >uproot and reconfigure the environment in which he produces CIFs
>>  >>>  > >from whatever he is currently doing to create an environment in
>>  >>>  > >which he can reliably create and, more importantly, transmit
>>  >>>  > >compliant UTF8 files.  This can be very tricky if he does only a
>>  >>>  > >partial job, say fudging in one special application (yet to be
>>  >>>  > >written), because if he stays with his old system, all kinds of
>>  >>>  > >tools will keep trying to transcode whatever he has produced back
>>  >>>  > >to whatever his system considers a standard. Those of us who have
>>  >>>  > >files, applications and tools that have lived through several
>>  >>>  > >generations of macs are living proof of the problem. Macs now
>>  >>>  > >have excellent UTF8/16 unicode support, but every once in a while
>>  >>>  > >in working with a unicode file I find it has been strangely and
>>  >>>  > >unexpectedly converted to something else, and it can be really
>>  >>>  > >tricky to spot when the unaccented roman text part has been left
>>  >>>  > >untouched but just a few accented letters have gotten different
>>  >>>  > >accents.
>>  >>>  > >Mandating UTF8 is simply trying to shift a serious software
>>  >>>  > >problem from the central handlers of CIF (IUCr, PDB, etc.) to the
>>  >>>  > >external users. Most users will probably have the good sense to
>>  >>>  > >simply ignore the demand and leave the burden just where it is
>>  >>>  > >now.  A few sophisticated users will probably adapt with no
>>  >>>  > >trouble, but the punishment for those users who blindly follow
>>  >>>  > >orders before we have a complete multiplatform supporting
>>  >>>  > >infrastructure in place by mandating UTF8 is severe, expensive
>>  >>>  > >and undeserved.  Until and unless we have developed solid
>>  >>>  > >support, we will just be alienating people from CIF.  I will
>>  >>>  > >continue to oppose such a move.
>>  >>>  >
>>  >>>  > [...]
>>  >>>  >
>>  >>>  >
>>  >>>  > Email Disclaimer: http://www.stjude.org/emaildisclaimer
>>  >>>  > _______________________________________________
>>  >>>  > cif2-encoding mailing list
>>  >>>  > cif2-encoding@iucr.org
>>  >>>  > http://scripts.iucr.org/mailman/listinfo/cif2-encoding
>>  >>>  >
>>  >>>  >
>>  >>>
>>  >>>
>>  >>
>>  >
>>  >
>>  >--
>>  >=====================================================
>>  >  Herbert J. Bernstein, Professor of Computer Science
>>  >    Dowling College, Kramer Science Center, KSC 121
>>  >        Idle Hour Blvd, Oakdale, NY, 11769
>>  >
>>  >                  +1-631-244-3035
>>  >                  yaya@dowling.edu
>>  >=====================================================
>>  >
>>  >
>>
>>
>>  --
>>  =====================================================
>>    Herbert J. Bernstein, Professor of Computer Science
>>      Dowling College, Kramer Science Center, KSC 121
>>          Idle Hour Blvd, Oakdale, NY, 11769
>>
>>                    +1-631-244-3035
>>                    yaya@dowling.edu
>>  =====================================================
>>
>>
>


-- 
=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================
_______________________________________________
cif2-encoding mailing list
cif2-encoding@iucr.org
http://scripts.iucr.org/mailman/listinfo/cif2-encoding
