Re: [Cif2-encoding] Splitting of imgCIF and other sub-topics. .. .

See comments below.

On Mon, Sep 13, 2010 at 10:52 PM, Herbert J. Bernstein
<yaya@bernstein-plus-sons.com> wrote:
> I would suggest actually writing the utility you have in mind.

Why?  It is simply a CIF2 syntax parser with checksum of all the
contents thus parsed.  It is not worth spending the time on something
so obviously possible until we agree that we want such a system.
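To make the scale of the task concrete, here is a minimal sketch in Python. It assumes several things this thread has not settled: the name `content_checksum` is hypothetical, SHA-256 stands in for whatever digest is eventually chosen, and the whole decoded text is hashed rather than the individual parsed values, since no CIF2 parser is included. It illustrates only that a checksum taken over decoded characters, rather than raw bytes, survives a correct transcoding and changes when the file is read in the wrong encoding.

```python
import hashlib


def content_checksum(raw: bytes, encoding: str = "utf-8") -> str:
    """Return a hex digest of the decoded text (hypothetical scheme).

    Assumption: hashing is done over the characters, re-encoded to UTF-8,
    so the result does not depend on the on-disk encoding of the file.
    """
    # Decode strictly: undecodable bytes raise UnicodeDecodeError.
    text = raw.decode(encoding)
    # Normalise line endings so that platform differences do not matter.
    text = text.replace("\r\n", "\n").replace("\r", "\n")
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


if __name__ == "__main__":
    sample = "_publ_author_name 'Müller'\n"
    print(content_checksum(sample.encode("utf-8"), "utf-8"))
    print(content_checksum(sample.encode("latin-1"), "latin-1"))
```

The two printed digests are identical because both byte strings decode to the same characters. Decoding the Latin-1 bytes under a different single-byte encoding such as cp437 yields different characters and therefore a different digest, which is the mismatch a CIF2-aware program would report.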

> In practice, inasmuch as a CIF file looks like a text file, people
> are very likely to just pick one up in any convenient text editor
> change what they want to change and write an unidentified pseudo-cif
> file back out.  Anything else needs to be provided to them in
> a complete, platform portable, well-documented package they can
> use easily in place of an editor that they use all the time for
> everything else.

Note that we are not suggesting replacing editors. Far from it: if we
could do that, we wouldn't have a problem in the first place.

> Please be practical -- CIF is a working tool, embedded in the IUCr
> journal process flows, in many crystallographic applications, in
> the PDB workflows, in Dectris detector software, etc., etc.
> The more disruptive you make the transition from CIF1 to CIF2, the
> more software and documentation you need to create to allow people
> to make the transition actually happen.  We are essentially in
> the same place we were in Osaka.  How do we break out of this
> loop and move forward?

We are a lot further forward than Osaka.  We have a *complete* syntax
specification on the table, which has received zero objections outside
of this group.  No further DDLm problems have been identified. The
only issue left unresolved is that not enough encodings are allowed,
although the one encoding that is allowed is actually sufficient for
all of the useful work that the IUCr expect to do.  We could take what
we have to Madrid, with a single caveat that a system for dealing with
non-UTF8 encodings is under consideration, and (if the response on the
mailing lists is any indication) everybody would be happy outside of
this list.  As for demonstrations, Nick and Syd have been
demonstrating this system for over a decade (with cosmetic differences
in syntax).

> We need a realistic plan to get our job done and have a complete
> specification with the necessary supporting software for CIF2 in place
> and ready to demonstrate for Madrid, or I would suggest we
> accept the failure of this effort, and start over.

Somewhat overwrought, don't you think?  Because we can't agree on a
scheme for additional encodings we should chuck CIF2 syntax, DDLm and
dREL overboard??  When the IUCr will function perfectly well with UTF8
only? If you would like to start coding, please structure your code so
that the decoding step may take other encodings beyond UTF8.  The rest
is in the draft standard (you will be pleased to see the lack of
ambiguity in that standard; it will make your task easier).
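One way to keep the decoding step open to further encodings is sketched below in Python. This is an illustration under stated assumptions, not part of the draft standard: the names `decode_cif2` and `DEFAULT_ALLOWED` are hypothetical, and the set of permitted encodings is a parameter precisely because that set is what remains undecided. Everything downstream of the decoder would then work on decoded text only.

```python
import codecs

# Assumption: UTF-8 is the only encoding permitted at present.
DEFAULT_ALLOWED = frozenset({"utf-8"})


def decode_cif2(raw: bytes, encoding: str = "utf-8",
                allowed=DEFAULT_ALLOWED) -> str:
    """Decode the bytes of a CIF2 file into text (hypothetical helper).

    `allowed` holds canonical Python codec names; widening it is the only
    change needed if further encodings are accepted later.
    """
    # Canonicalise aliases such as "UTF8"; unknown names raise LookupError.
    canonical = codecs.lookup(encoding).name
    if canonical not in allowed:
        raise ValueError(f"encoding {canonical!r} is not permitted")
    # Strict decoding: malformed input raises UnicodeDecodeError
    # instead of being silently replaced.
    return raw.decode(canonical, errors="strict")


if __name__ == "__main__":
    text = decode_cif2("data_example\n_cell_length_a 5.0\n".encode("utf-8"))
    print(text)
```

A caller that has agreed to accept, say, Latin-1 would pass `allowed={"utf-8", codecs.lookup("latin-1").name}`; the parser itself never sees bytes and so does not change.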

>   -- Herbert
> At 10:32 PM +1000 9/13/10, James Hester wrote:
>>The original concept was to edit the non-UTF8 files in the text editor
>>of choice, then run a simple checksumming application (that
>>understands CIF2 syntax) to update the checksum.  This application
>>would also pick out sections of text that would be displayed
>>incorrectly in the wrong encoding, and ask the user to confirm that
>>the text was displayed correctly.  Such an application could be made
>>freely available by the IUCr.
>>On Mon, Sep 13, 2010 at 8:22 PM, SIMON WESTRIP
>><simonwestrip@btinternet.com> wrote:
>>>  I questioned:
>>>  "For example, if mandatory, does that mean it becomes impossible to create a
>>>  non-UTF8 CIF without using
>>>  CIF2-aware software?"
>>>  In some respects this might not be a bad idea - i.e. restricting the use of
>>>  non-UTF8 to CIF2-aware systems...
>>>  Simon (thinking aloud)
>>>  ________________________________
>>>  From: SIMON WESTRIP <simonwestrip@btinternet.com>
>>>  To: Group for discussing encoding and content validation schemes for CIF2
>>>  <cif2-encoding@iucr.org>
>>>  Sent: Monday, 13 September, 2010 11:05:12
>>>  Subject: Re: [Cif2-encoding] Splitting of imgCIF and other sub-topics. .. .
>>>  Yes - I believe that such a declaration should be mandatory for all non-UTF8
>>>  CIF2 files,
>>>  and agree that a supporting checksum mechanism would be very useful to
>>>  CIF2-aware
>>>  programs. Until I've revisited the checksum scheme, I can not say that the
>>>  checksum should be mandatory too.
>>>  For example, if mandatory, does that mean it becomes impossible to create a
>>>  non-UTF8 CIF without using
>>>  CIF2-aware software?
>>>  I need to review the discussions on checksums and indeed the various forms
>>>  that such a declaration might take,
>>>  but I do believe in the principle that it should be mandatory for all
>>>  'stand-alone' non-UTF8 CIF2 files.
>>>  If a CIF is packaged in a container, then it will be the job of non-CIF
>>>  software to retrieve it from the container
>>>  and deliver it in its original form. So a non-UTF8 CIF packaged in a
>>>  non-UTF8 container (or even a UTF8 container)
>>>  should still carry its non-UTF8 declaration.
>>>  Cheers
>>>  Simon
>>>  ________________________________
>>>  From: James Hester <jamesrhester@gmail.com>
>>>  To: Group for discussing encoding and content validation schemes for CIF2
>>>  <cif2-encoding@iucr.org>
>>>  Sent: Monday, 13 September, 2010 6:24:42
>>>  Subject: Re: [Cif2-encoding] Splitting of imgCIF and other sub-topics. .. .
>>>  Hi Simon: the issue with such an encoding declaration is that it is
>>>  not supported by generic text tools, and so would not be automatically
>>>  inserted, updated or respected when creating, editing (ie open in one
>>>  encoding, save in another) or transcoding a CIF2 file.  This means it
>>>  has no status beyond a hint that could cause as many problems as it
>>>  solves. Such a declaration becomes more robust if accompanied by the
>>>  checksum that John B suggested.  The checksum gives some guarantee
>>>  that the encoding has been checked by a CIF-aware program.
>>>  If you are proposing that such a declaration and checksum be mandatory
>>>  for all non-UTF8 CIF2 files (not only during transfer), I agree with
>>>  you that this would be acceptable.
> --
> =====================================================
>  Herbert J. Bernstein, Professor of Computer Science
>    Dowling College, Kramer Science Center, KSC 121
>         Idle Hour Blvd, Oakdale, NY, 11769
>                  +1-631-244-3035
>                  yaya@dowling.edu
> =====================================================
> _______________________________________________
> cif2-encoding mailing list
> cif2-encoding@iucr.org
> http://scripts.iucr.org/mailman/listinfo/cif2-encoding

T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148