Discussion List Archives


Re: [Cif2-encoding] [ddlm-group] options/text vs binary/end-of-line

I will try not to make this exchange into another blockbuster!
Comments inserted below.

On Sat, Sep 4, 2010 at 1:36 AM, Bollinger, John C
<John.Bollinger@stjude.org> wrote:
> James: I am not ignoring our ongoing blockbuster exchange, but I have
> been unable to devote the time to it that it deserves.  In the
> meantime, I have a shorter response to these comments:
>
> On Thursday, September 02, 2010 11:22 PM, James Hester wrote:
>>I agree that CIF1 is not *defined* as ASCII-only, and I have no wish
>>to push for any redefinition.  I am stating that CIF1 is used by the
>>community *as if* it were ASCII-only.
>
> I think it's more accurate to say that CIF1 is used by the community
> under the assumption that CIFs comply with the default text conventions
> for the environment.  This is reasonable, because the CIF1 design
> assumes that exchange of CIFs between dissimilar environments
> involves conversion from one set of text conventions to another (sounds
> familiar?).  For example, CIF1 processors are not required to recognize
> non-native line termination semantics.
>
> CIF1's limited character repertoire and the great prevalence of
> ASCII-compatible character encodings make it tempting to describe that
> situation as de facto ASCII-only.  That is a mischaracterization,
> however, ignoring CIF1's assumption of text conversion accompanying CIF
> exchange.  That assumption makes a great difference if you want to
> design CIF software that is reasonably portable to systems that do not
> default to an ASCII-compatible encoding.

Yes, I might have been overstretching to characterise the current
situation as involving the community choosing to use ASCII, rather
than simply having to use ASCII.  Nevertheless, I believe it is still
fair to say that substituting UTF-8 for ASCII reduces the restrictions
on CIF users, rather than increases them.
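
To make the "reduces the restrictions" point concrete, here is a minimal
Python sketch. It assumes nothing beyond the standard library; the helper
name decodes_identically and the sample CIF fragments are made up for
illustration and are not taken from any CIF software. It shows that pure
ASCII bytes are already valid UTF-8 with the same meaning, while UTF-8
additionally admits characters that ASCII cannot carry.

```python
def decodes_identically(raw: bytes) -> bool:
    """Return True if raw is valid ASCII and reads the same when decoded as UTF-8."""
    try:
        return raw.decode("ascii") == raw.decode("utf-8")
    except UnicodeDecodeError:
        # The bytes are not pure ASCII (or not valid UTF-8 either).
        return False


# Hypothetical CIF fragments, for illustration only.
ascii_cif = b"data_example\n_cell_length_a 5.43\n"
utf8_cif = "data_example\n_chemical_name_common 'β-quartz'\n".encode("utf-8")

print(decodes_identically(ascii_cif))  # an ASCII file is unchanged under UTF-8
print(decodes_identically(utf8_cif))   # the Greek letter has no ASCII encoding
```

The second fragment still decodes cleanly as UTF-8; it only fails the
ASCII half of the comparison.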

>> When speculating about the
>>community response to CIF2, the actual community response to the CIF1
>>standard is a perfectly reasonable starting point.
>
> Indeed, hence the continuing line of argument that users would want to
> continue to use CIFs encoded according to local convention, just as they
> already do.  The new and disruptive thing here is support for non-native
> encodings, which in most places include UTF-8.  I want UTF-8, but it's
> not free.
>
>>Are you suggesting that a CIF1 application that accepts only ASCII
>>encoding is not standards conformant?
>
> I am amused to see you arguing the other side of the "CIF software must
> accept all compliant CIFs" argument now :) .

Unfortunately the horse has well and truly bolted on the CIF1 standard
(with which I was not involved), so I have been attempting to use it
as a real-life test for what happens when optional behaviour remains
in the standard.  As you rightly point out, it turns out not to be a
particularly enlightening test, as the apparent open slather in CIF1
encodings essentially reduces to ASCII everywhere.  I do not yet
resile from my "CIF software must accept all compliant CIFs" line, but
it is too late (and pointless) to go back and fix CIF1 to make this
possible.

> I don't know about Herb, but I would find that program's behavior
> unacceptable if it were running on an EBCDIC-based computer.  The
> standard says almost nothing about program behavior, so I could not call
> the *program* non-conformant, but it would reject conformant
> (EBCDIC-encoded) CIFs that I would expect it to accept.

I take your point, and think I see in what sense my putative
ASCII-only CIF program would have been non-conformant to the CIF1
standard in an EBCDIC environment, hence Herb's cryptic (to me) remark
about non-conformance.

>>  Because all that I am asserting
>>is that useful CIF1 programs that support non-ASCII encodings are
>>either rare or non-existent, despite being allowed by the standard.  I
>>see no hint of non-standards-conforming programs in this description.
>
> I suspect that many CIF1 programs would in fact support a non-ASCII encoding
> just fine when used on a system where that encoding is the default.  In
> fact, I expect that many of them would fail on ASCII- (or UTF-8-)encoded CIFs
> in such an environment.  In other words, I believe that there are many useful
> CIF1 programs that support non-ASCII encodings, simply as a result of assuming
> default text conventions.  This is the difference between "ASCII-only" and
> "text".

I have strong doubts that such local defaults are sufficiently robust,
well-defined, and consistently used to allow us to follow CIF1 in this
respect.  And, even if such local defaults were reliable, the issue
would remain of writing portable CIF programs (portable between
language environments as well as operating systems) and transfer of
such files between each language environment/operating system.
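
A rough Python sketch of the transfer problem, under stated assumptions:
cp037 (one of Python's EBCDIC codecs) stands in for the "local default"
of an EBCDIC system, and the helper try_decode and the sample text are
invented for illustration. It contrasts a reader that uses the local
convention, an ASCII-only reader, and a local-convention reader handed a
file from a foreign environment.

```python
from typing import Optional


def try_decode(raw: bytes, encoding: str) -> Optional[str]:
    """Decode raw with the given encoding, or return None if it is invalid."""
    try:
        return raw.decode(encoding)
    except UnicodeDecodeError:
        return None


# Hypothetical CIF fragment, for illustration only.
cif_text = "data_example\n_cell_length_a 5.43\n"

ebcdic_bytes = cif_text.encode("cp037")  # as written on the EBCDIC system
ascii_bytes = cif_text.encode("ascii")   # as written on an ASCII system

# Reader using the local (EBCDIC) convention on a locally written file.
native_read = try_decode(ebcdic_bytes, "cp037")

# ASCII-only reader given the EBCDIC file: the bytes are not valid ASCII.
ascii_only_read = try_decode(ebcdic_bytes, "ascii")

# Local-convention reader given an unconverted ASCII file: no error is
# raised, but the decoded text is not the original.
foreign_read = try_decode(ascii_bytes, "cp037")

print(native_read == cif_text)
print(ascii_only_read is None)
print(foreign_read == cif_text)
```

The last case is the awkward one for file transfer: the mismatch is
silent unless something outside the decoder identifies the encoding.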

CIF1 put these text conversion issues into the "somebody else's
problem" basket (e.g. let Chester figure out what to do with an
EBCDIC-encoded CIF), which perhaps was efficient and responsible when
the only contenders were "ASCII-compatible" and "almost-extinct
EBCDIC".

all the best,
James.
> John
> --
> John C. Bollinger, Ph.D.
> Department of Structural Biology
> St. Jude Children's Research Hospital
>
>
> Email Disclaimer:  www.stjude.org/emaildisclaimer
>
> _______________________________________________
> cif2-encoding mailing list
> cif2-encoding@iucr.org
> http://scripts.iucr.org/mailman/listinfo/cif2-encoding
>



-- 
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148

