Re: [Cif2-encoding] How we wrap this up

I think the crux of the issue is as follows:

[But part of our difficulty is that we are all having separate
epiphanies, and focusing on five different "cruxes". Clarifying
the real divergence between our views would be a genuine benefit of
a Skype conference, to which I have no personal objection.]

In the real world, a need may arise to exchange CIFs constructed in
non-canonical encodings. ("Canonical" probably means UTF-8 and/or
UTF-16). Such a need would involve some transcoding strategy.

What is the actual likelihood of that need arising?

I would characterise James's position as "not very, and even less
if the software written to generate CIFs is constrained to use
canonical encodings within the standard".

I would characterise the position of the rest of us as "reasonable to
high, so that we wish to formulate the standard in a way that
recognises non-canonical encodings and helps to establish or at
least inform appropriate transcoding strategies". There appear to be
strong disagreements among us, but in fact there's a lot of common
ground, and a drafting exercise would probably move us towards a
consensus.

Do you agree that that is a fair assessment?

If so, we can analyse further: what are the implications of mandating
a canonical encoding or not if judgement (a), James's, is wrong and if
judgement (b), that of the rest of us, is wrong? My feeling is that the
world will not end - or even
change very much - in any case; but it could determine whether we
need to formulate an optimal transcoding strategy now, or can defer
it to a later date.

However, if anyone thinks this is just another diversion, I'll drop
this line of approach so as not to slow things down even more.

Regards
Brian

On Tue, Sep 28, 2010 at 09:28:25PM -0400, Herbert J. Bernstein wrote:
> John,
> 
> Now I am totally confused about what you are proposing and agree with Simon
> that what is needed is for you to state your proposal as the precise wording
> that you propose to insert and/or change in the current CIF2 change document
> "5 July 2010: draft of changes to the existing CIF 1.1 specification 
> for public discussion"
> 
> If I understand your proposal correctly, the _only_ thing you are proposing
> that differs in any way from my proposed motion is a mandate that a 
> CIF2 conformant reader must be able to read a UTF8 CIF2 file, but 
> that _no_ CIF application would actually be required to provide such 
> code, provided there was some mechanism available to transcode from 
> UTF8 to the local encoding,
> which does not seem to be a mandate on the conformant CIF2 reader at
> all, but a requirement for the provision of a portable utility to
> do that external transcoding.
> 
> If that is the case, wouldn't it make more sense to just provide that
> utility than to argue about whether my motion requires somebody to write
> their own?  Having the utility in hand would avoid having multiple,
> conflicting interpretations of this input transcoding requirement.
> 
> If I have read your message correctly, please just write the utility you
> are proposing.  If I have read your message incorrectly, please
> write the specification changes you propose for the draft changes
> in place of the changes in my motion.
> 
> _This_ is why it was, is, and will remain a good idea to simply have
> a meeting and talk these things out.
> 
> 
> 
> At 5:21 PM -0500 9/28/10, Bollinger, John C wrote:
> >Dear Herb,
> >
> >On Tuesday, September 28, 2010 2:41 PM, Herbert J. Bernstein
> >
> >>    The norm in standards work is to deprecate features for a while
> >>(at least months and preferably years) before you remove them.
> >
> >I acknowledge that principle, and I see no incompatibility between 
> >it and option 5.  More below.  Do not overlook my final comments.
> >
> >>>  Recommending UTF-8 and / or UTF-16 without mandating support for one or
> >>>  both does not get us where I insist we need to be.
> >>
> >>The problem is coming to agreement on "support" and that pesky word
> >>"mandating".
> >
> >By "mandating support" I mean that a file containing a sequence of 
> >characters conforming to the CIF syntax and encoded via UTF-8 is 
> >defined to be a conformant CIF everywhere.  By itself, that would 
> >not obligate anyone to encode their CIFs in UTF-8.  It would, 
> >however, mean that fully-conformant CIF2 readers must be prepared to 
> >accept CIFs encoded in that manner.  Even that is no barrier to 
> >adoption, though, for CIF users must be prepared to deal with the 
> >encoding question under any alternative on the table, and if they 
> >can read only their local encoding then they would need to be able 
> >to transcode in any event.
> >
> >>   Up until now in order for a CIF application developer or
> >>user to produce compliant CIFs, all they had to do was to produce a text
> >>file in whatever encoding was provided on their system. Now you wish to
> >>mandate that they be able to produce UTF8 or UTF16, even if they are
> >>running on some code-page based system.
> >
> >Not at all.  It is the single objective of the "+ local" provision 
> >of my preferred alternative to enable application developers, 
> >authors, and anyone else to continue to do exactly what you describe 
> >them already doing.
> >
> >[...]
> >
> >>We have already made that mistake with other CIF2 features, e.g. the
> >>drastic change in string quoting.
> >
> >I agree with you that such changes are mistaken.  That was my 
> >motivation in questioning UTF-8 only to begin with.
> >
> >[...]
> >
> >>The motion I have proposed does not make anything worse for anybody
> >>currently using CIF and allows them to start moving into CIF2 now.
> >
> >Neither your motion nor my preferred one makes anything worse for 
> >anybody using CIF1, and both allow them to start moving into CIF2 
> >now.
> >
> >>   Your
> >>approach imposes conditions it will take months or years to meet with no
> >>prospect that satisfying your demands will solve any problem for anybody.
> >
> >My approach imposes no special conditions, but it offers the 
> >advantages of UTF-8 as an available standard feature.  As long as we 
> >are relying on "The norm in standards work," wouldn't you agree that 
> >it is normal to introduce new features to a standard no later than 
> >the time the features they supersede are deprecated?
> >
> >>Please rethink your position.
> >
> >I have considered my position carefully, and rethought it several 
> >times over the course of our discussion.  I firmly believe that I am 
> >advocating a solid and eminently workable compromise between support 
> >of the existing CIF1 base and the future needs of CIF2 users.
> >
> >>If we recommend UTF8/UTF16 support we have a decent chance that somebody
> >>will simply provide it.  If we mandate UTF8/UTF16 support we force
> >>pointless delays in the adoption of the rest of CIF2 and gain what in
> >>exchange?
> >
> >Even if CIF2 ended up UTF-8 only, people could write software 
> >exactly as they would do under your proposal, then wrap it in a 
> >transcoder.  Or in that event I think it likely that some would 
> >implement my preferred alternative (5) as an extension.  Perhaps you 
> >would agree, as that's the same end result that you think likely 
> >somebody will simply provide, coming from the opposite direction.
> >
> >I see no reason to fear any significant delays in CIF2 adoption 
> >arising from any particular result this discussion may ultimately 
> >reach.
> >
> >[...]
> >
> >>However, the real answer (not a joke) is that a text encoding is whatever
> >>the formatted I/O system in a Fortran compiler on the system under
> >>discussion reads and writes or the format of a COBOL EBCDIC-sequential
> >>file or a COBOL ASCII line-sequential file, or what a text editor on the
> >>system handles.  That is the point -- text is something very, very system
> >>and language dependent. The strange thing is that text files have a much
> >>longer practical survival time than binary files, as backwards as that may
> >>seem, because there is a much larger investment in ensuring the continued
> >>readability of text files than of binary files.
> >
> >I am laughing, but not because I think you're joking.  As far as I 
> >can tell, that answer is functionally identical to what I have been 
> >advocating as "local".   It's even worded similarly.  My desire to 
> >include it (but not to be limited to it) is the primary difference 
> >between James's most preferred position and mine.
> >
> >
> >Best Regards,
> >
> >John
> >--
> >John C. Bollinger, Ph.D.
> >Department of Structural Biology
> >St. Jude Children's Research Hospital
> >
> >
> >Email Disclaimer:  www.stjude.org/emaildisclaimer
> >
> >_______________________________________________
> >cif2-encoding mailing list
> >cif2-encoding@iucr.org
> >http://scripts.iucr.org/mailman/listinfo/cif2-encoding
> 
> 
> -- 
> =====================================================
>   Herbert J. Bernstein, Professor of Computer Science
>     Dowling College, Kramer Science Center, KSC 121
>          Idle Hour Blvd, Oakdale, NY, 11769
> 
>                   +1-631-244-3035
>                   yaya@dowling.edu
> =====================================================