Re: [Cif2-encoding] Revised Motion

Hi everybody:

I think it is fair to say that we are all agreed on the broad principle of the compromise position I proposed recently.  I interpret the current lack of consensus as a desire for a bit of technical polish.

One reason for the disparity is that my proposal was implicitly expressed in terms of a new paragraph to be added to our current 'Changes' document, which is posted at http://www.iucr.org/__data/assets/pdf_file/0017/41426/cif2_syntax_changes_jrh20100705.pdf.  Yesterday, Herbert and I (for no particular reason) discussed the changes in the context of Herbert's motion, which I had interpreted as largely repeating the content of that 'Changes' document, with the exception of the encoding paragraphs.  I was not aware that any other sections of Herbert's motion were controversial.  My expectation is that we would accept (or decline) Herbert's motion as a joint statement of our position, and then rework the 'Changes' document accordingly.

Herbert: I now notice that the paragraph immediately preceding the one we changed could be read as conflicting with the new paragraph that you and I wrote, because it appears to cover the whole code point range.  Could I suggest that it be replaced by the following:

It is understood that CIF2 documents may be constructed and maintained
on computers that implement other character encodings.  For maximum
portability only the clearly identified equivalents to the Unicode
characters identified above and below should be used and use of UTF-8
for a concrete representation is highly recommended.  However, for
compatibility with CIF1 behaviour, there is no formal restriction on
the encoding of CIF2 files providing they contain only code points from
the ASCII range.
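
To illustrate the last sentence of that wording (this sketch is not part of the proposed text): the test is on code points after decoding, not on the bytes in the file.  The function name and the sample CIF fragment below are hypothetical, and the caller is assumed to know the file's encoding.

```python
def has_only_ascii_code_points(raw: bytes, encoding: str) -> bool:
    """Decode raw with the stated encoding and check every code point is below U+0080."""
    text = raw.decode(encoding)
    return all(ord(ch) < 128 for ch in text)


# Hypothetical CIF fragment that uses only ASCII-range code points.
sample = "data_example _cell.length_a 5.0"

# Stored in an EBCDIC code page, the bytes differ from ASCII,
# but the decoded code points are all in the ASCII range.
print(has_only_ascii_code_points(sample.encode("cp037"), "cp037"))

# A non-ASCII character fails the check even when stored as UTF-8.
print(has_only_ascii_code_points("5.0 Å".encode("utf-8"), "utf-8"))
```

Under this reading, the first file would face no formal encoding restriction, while the second would fall under the UTF-8 recommendation.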

Regarding the meaning of 'text': in the 'Changes' document, there is a section for definitions where I think we can define 'text' if we so wish; personally I think that writing 'plain text' instead of 'text' would be sufficient.

>   James and I had a good e-meeting and came up with the following
>revised wording.  If anybody objects to this motion, please speak
>up now.

With apologies, I object.  This proposal has exactly the same problem that options (1) and (2) did: it does not define "text file".  It is worse in this case, however, because the problem cannot be fixed merely by adding Herbert's definition (or mine).  In most environments that definition does not encompass UTF-8 encoded text containing non-ASCII characters, so the recommendation to use UTF-8 implies some other, ill-defined definition.
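
A minimal sketch of the problem, assuming an environment whose default text encoding is Latin-1 (the function name and the sample strings are hypothetical): ASCII-only content reads back unchanged, but UTF-8 bytes for a non-ASCII character do not.

```python
def survives_default_decoding(text: str, default_encoding: str) -> bool:
    """Write text as UTF-8, read it back with the environment's default encoding,
    and report whether the original text is recovered."""
    raw = text.encode("utf-8")
    return raw.decode(default_encoding, errors="replace") == text


# ASCII-only content is the same under UTF-8 and Latin-1.
print(survives_default_decoding("_cell.length_a 5.0", "latin-1"))

# "Å" becomes two bytes in UTF-8, which Latin-1 reads as two other characters.
print(survives_default_decoding("5.0 Å", "latin-1"))
```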

I am quite surprised that the result presented is so different from James's recent compromise proposal, which seemed poised to serve as the basis for a consensus result.  Perhaps a viable solution would be to include a definition of "text file" derived from that proposal.


