Re: [Cif2-encoding] How we wrap this up

To: Group for discussing encoding and content validation schemes for CIF2 <cif2-encoding@xxxxxxxx>
Subject: Re: [Cif2-encoding] How we wrap this up
From: "Herbert J. Bernstein" <yaya@xxxxxxxxxxxxxxxxxxxxxxx>
Date: Tue, 28 Sep 2010 07:58:10 -0400 (EDT)
In-Reply-To: <646265.82162.qm@web87004.mail.ird.yahoo.com>
References: <AANLkTi=hmKNFMgaeMqt69=sG6dOmxZRUrffB1khjF+mZ@mail.gmail.com><526633.3484.qm@web87004.mail.ird.yahoo.com><alpine.BSF.2.00.1009240742480.8859@epsilon.pair.com><613218.81205.qm@web87011.mail.ird.yahoo.com><281388.90819.qm@web87012.mail.ird.yahoo.com><463665.7127.qm@web87004.mail.ird.yahoo.com><alpine.BSF.2.00.1009251413550.93269@epsilon.pair.com><262880.46378.qm@web87002.mail.ird.yahoo.com><alpine.BSF.2.00.1009251537250.57408@epsilon.pair.com><a06240800c8c5653f38cf@192.168.2.104><476110.27334.qm@web87005.mail.ird.yahoo.com><a06240805c8c6224b8789@192.168.2.104><a06240803c8c65f78a402@149.72.2.199><8F77913624F7524AACD2A92EAF3BFA5416659DEDE3@SJMEMXMBS11.stjude.sjcrh.local><alpine.BSF.2.00.1009271801070.86201@epsilon.pair.com><alpine.BSF.2.00.1009271900080.86201@epsilon.pair.com><AANLkTikudiXBk7orHSAH=JonoeQHeNXVrzvAZmH3Wt94@mail.gmail.com><646265.82162.qm@web87004.mail.ird.yahoo.com>

Dear Colleagues,

   I am puzzled.  People do not seem to want to have a meeting, but
they do seem to want to keep "talking" in the form of emails that
repeat points we all have discussed many, many times.  Please
recognize that this is not working.  A meeting or e-meeting also
may not work, but it is something we have not tried, and many
other times in the history of CIF, meetings have resolved
similarly seemingly intractable issues.

   Please, let's try having a Skype conference call.

   Have to go now -- time for my next Skype conference call, second
one this morning.  They really do seem to help.

   Regards,
     Herbert

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================

On Tue, 28 Sep 2010, SIMON WESTRIP wrote:

> "I [John] will not support an alternative that fails to make UTF-8 a universally
> supported character encoding for  CIF, and it seems clear that James will
> not, either."
> 
> Herbert has recently stated
> 
> "I personally have no objection to specification of UTF-8
> as a default encoding in the absence of indications of some
> different encoding."
> 
> To me, this supports making UTF-8 a universally supported character encoding.
> I would hope that a developer reading this might conclude that if the encoding is
> not recognizable (UTF16 is recognizable), then they should default to UTF8,
> so CIF software must be able to handle UTF8.
> 
> A user reading this, especially in the context of recommendations, would also
> hopefully conclude that UTF8 is the default encoding, and be aware that if they
> wish to make use of the new non-ASCII support offered by CIF2, they may have to
> pay attention to how that new feature should be used. In the end they can
> continue to do what they have always done when editing CIFs, which is to work
> with ASCII.  'As for CIF1...' allows them to do this without having to worry
> about the encoding; while the specification of a default encoding gives both
> developers and users the means to make use of non-ASCII, without uncertainty.
> 
> So I think the 'As for CIF1...' proposals with this explicit default encoding
> are certainly heading towards a workable compromise. Herbert is unhappy to
> mandate a particular encoding for non-ASCII use, but has agreed to recommend
> UTF8 and UTF16 in such cases. Such recommendations, along with a default
> encoding that should be adopted in the absence of any pointers to the contrary,
> could boil down to UTF8/16 + local to all intents and purposes, and could boil
> down to UTF8/16 if you want to use non-ASCII text.
> 
> Cheers
> 
> Simon
> 
> 
> _____________________________________________________________________________________
> From: James Hester <jamesrhester@gmail.com>
> To: Group for discussing encoding and content validation schemes for CIF2
> <cif2-encoding@iucr.org>
> Sent: Tuesday, 28 September, 2010 7:40:40
> Subject: Re: [Cif2-encoding] How we wrap this up
> 
> I happily confess to agreeing with John on this one.
> 
> I am beginning to think I might understand Herbert's viewpoint.  Correct me Herbert
> if I am wrong, but you are suggesting that current CIF1 workflows are implicitly or
> explicitly choosing some encoding when they operate with CIF1 files.  If we mandate
> some other encoding for CIF2, then those workflows will be adversely affected as they
> will need to choose a different encoding.
> 
> I'll critique this further if Herbert confirms that this is what he is getting at.
> 
> On Tue, Sep 28, 2010 at 9:14 AM, Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
> wrote:
>       Now to the substance of John's argument, which he also attributes to
>       James:
>
>       "I will not support an alternative that fails to make UTF-8 a universally
>       supported character encoding for  CIF, and it seems clear that James will
>       not, either."
> 
> To me it seems the key to this position is "universally supported" which
> would seem to imply that there be full documentation and software to allow
> CIF users to work within that context.  The practicality of that position
> comes down to the IUCr and the PDB working out workflows and the necessary
> support infrastructure to get the entire small and macromolecular CIF
> community to make the transition.  Do the IUCr journal operation and the
> PDB have plans and a realistic timeline to do this?  If not, then are CIF2
> and dREL to wait on the sidelines until then?  If so, may we please see
> them to get a sense of when the transition could happen?
> 
> I suspect we are 90% of the way to where we need to be, but in the same
> sense as 90% is used in the old saw:  The first 90% of the effort takes
> the first 90% of the time, and the last 10% of the effort takes the last
> 90% of the time.  But I may be wrong.  Let's see the plans for the
> proposed transition for the IUCr journal operation and the PDB, dealing
> with both core CIF and mmCIF in a UTF8 CIF2 world.
> 
> Regards,
>   Herbert
> 
> =====================================================
>  Herbert J. Bernstein, Professor of Computer Science
>    Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
> 
>                  +1-631-244-3035
>                  yaya@dowling.edu
> =====================================================
> 
> On Mon, 27 Sep 2010, Herbert J. Bernstein wrote:
> 
> > Dear Colleagues,
> >
> >   John declines to join in a meeting because ...
> >
> > "Is there anything to gain?  The last few days have been more illuminating
> > than the last several weeks, but it still seems evident to me that there
> > is a fundamental difference of opinion.  I will not support an alternative
> > that fails to make UTF-8 a universally supported character encoding for
> > CIF, and it seems clear that James will not, either.  You seem adamant
> > that there be no such universal requirement.  I think I understand your
> > position better than I used to do, but I don't see where there is any
> > scope for a consensus.  My best offer is already on the table in option
> > (5) +- UTF-16."
> >
> > I am very sorry to hear that.  I hope the rest of you will have the
> > courtesy to participate in a Skype meeting.  Perhaps no new facts
> > or logic will come to light.  Perhaps something will come to light
> > that leads to better common understanding and consensus.  We'll never know
> > unless we try.  I for one think that most of us are open minded and
> > willing to try to reach an accommodation that serves the community well.
> >
> >   Regards,
> >     Herbert
> >
> > =====================================================
> >  Herbert J. Bernstein, Professor of Computer Science
> >    Dowling College, Kramer Science Center, KSC 121
> >         Idle Hour Blvd, Oakdale, NY, 11769
> >
> >                  +1-631-244-3035
> >                  yaya@dowling.edu
> > =====================================================
> >
> > On Mon, 27 Sep 2010, Bollinger, John C wrote:
> >
> >> Dear Herb,
> >>
> >> On Monday, September 27, 2010 8:45 AM, Herbert J. Bernstein wrote:
> >>
> >>> The problem is that options 3, 4 and 5 specifically prescribe the use of
> >>> Unicode characters (that is the entire point of those options -- and that
> >>> is the point in dispute -- whether we should be prescribing UTF8 or using
> >>> it as we now use ASCII, as a way to be clear what we are talking about as
> >>> in CIF1) and we simply are not ready to deal with such a requirement yet.
> >>
> >> I think I have reached my own epiphany regarding your position.  Do correct
> >> me if I am wrong, but I now think you're saying that you don't want to
> >> distinguish any particular encoding(s) as universally acceptable (much less
> >> universally required), correct?  If so, would it be fair to describe that as
> >> just the "local" part of option 5?
> >>
> >> [...]
> >>
> >> On Monday, September 27, 2010 12:07 PM, Herbert J. Bernstein wrote:
> >>
> >>> Ah, now I begin to understand the difference in our view.  I view CIF for
> >>> journal use and PDB deposition as having a controlled vocabulary, via
> >>> combinations of dictionaries, advice to authors, deposition standards,
> >>> etc.  You seem to view CIF as allowing completely arbitrary, uncontrolled
> >>> text.  [...]
> >>
> >> Yes, I intentionally take an unilluminated view of the problem, but that is
> >> both purposeful and useful.  Text is the foundation on which CIF is built.
> >> The bulk of the spec is devoted to defining which text conforms and which
> >> does not.  The "F" in "CIF" stands for "file," however, and if the spec is
> >> to answer the question of which *files* conform, or the related question of
> >> what a particular file means, then it needs to address the mapping between
> >> "text" and "file".  Options (1) and (2) seem crafted specifically to avoid
> >> doing so.
> >>
> >> I understand using local convention to fill the gap (a la "local"), but I
> >> fail to see how any amount of author instructions, deposition standards,
> >> etc. can adequately do the same.  At best that moves a burden that
> >> rightfully should be borne by the format spec onto application-dependent
> >> external documents, some outside IUCr's control.  I have shown by my
> >> advocacy for option (5) that I am willing to make the definition of a
> >> conformant CIF system-dependent.  I acknowledge that various applications
> >> place different demands on the data content of CIFs they consume.  I am
> >> not, however, willing to make the basic definition of CIF conformance
> >> application-dependent.
> >>
> >>> Please note that proposals 1 and 2 do _not_ affect "which
> >>> byte-sequence representations of those characters will conform to
> >>> CIF2, under which circumstances" because they are not rigidly
> >>> prescriptive about any
> >>> particular byte sequences.
> >>
> >> Options (1) and (2) certainly DO affect that question, if only by leaving
> >> it open to later, possibly conflicting, interpretation by COMCIFS,
> >> individual developers, and others.  Option 5 is about as permissive as it
> >> reasonably can be regarding the binary form a CIF may take, while still
> >> being definitive enough that general-purpose software can be written to
> >> read conformant CIFs.  If my new understanding of your viewpoint is
> >> correct, however, then your objection may be that option 5 is *too*
> >> permissive on account of its explicit allowance for UTF-8 and UTF-16.  I
> >> would be willing to drop the explicit UTF-16 support (though UTF-16 might
> >> nevertheless squeeze in as "local" in some environments).  I will under no
> >> circumstances, however, support an alternative that allows any file to be
> >> found non-conformant on account of its being encoded in UTF-8.
> >>
> >> [...]
> >>
> >>> This is really getting out of hand.  We need a meeting.  If
> >>> everyone will send me their Skype id's, I will volunteer to
> >>> set up a Skype conference call at some time that works for
> >>> everybody (which I suspect will be 4 am EDT).  My guess is that
> >>> 1-2 hours of polite discussion will resolve this.  What
> >>> do we have to lose?
> >>
> >> Is there anything to gain?  The last few days have been more illuminating
> >> than the last several weeks, but it still seems evident to me that there is
> >> a fundamental difference of opinion.  I will not support an alternative
> >> that fails to make UTF-8 a universally supported character encoding for
> >> CIF, and it seems clear that James will not, either.  You seem adamant that
> >> there be no such universal requirement.  I think I understand your position
> >> better than I used to do, but I don't see where there is any scope for a
> >> consensus.  My best offer is already on the table in option (5) +- UTF-16.
> >>
> >>
> >> Respectfully,
> >>
> >> John
> >> --
> >> John C. Bollinger, Ph.D.
> >> Department of Structural Biology
> >> St. Jude Children's Research Hospital
> >>
> >>
> >> Email Disclaimer:  www.stjude.org/emaildisclaimer
> >>
> >> _______________________________________________
> >> cif2-encoding mailing list
> >> cif2-encoding@iucr.org
> >> http://scripts.iucr.org/mailman/listinfo/cif2-encoding
> >>
> 
> 
> 
> 
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
> 
>
