Re: [Imgcif-l] proposed change in first line of imgcif files

Dear imgCIFers,

Before responding to Herbert's suggestions, are we agreed that the
information derived from the 'style' and 'version' parts of the header
would not contain anything that couldn't be derived from the CIF file
proper?  In which case I think the header proposal can go ahead
regardless of the fate of the current 'whitespace' proposal.  It's
just that it would seem to be implied from the 'whitespace' proposal
that there might indeed be something in the header that would not
otherwise be preserved, thus my question.

In response to Herbert's current 'whitespace preservation' proposal:

>   There is an important issue on the table -- how to handle critical
> "extra-CIF" data, such as magic numbers and whitespace and comments.
> For people generating the data, they are just magic numbers,
> whitespace and comments.  For some people, they are just noise
> to be discarded, but for some uses, there is essential information
> (such as how to parse a particular imgCIF file) buried in there, and
> it must not be lost.

I agree that any essential information shouldn't be lost.

[edited]

Herbert's proposal aims to preserve information in comments by the
simple expedient of defining a CIF dataitem that contains the text of
the comment.   But how much information is actually preserved?
Certainly the physical text is preserved, but the meaning of that text
is not.  That is, if _ws.prologue is "##CBF Version 5.4 ADSC 210", any
dictionary which defines _ws.prologue as "the whitespace and comments
before the datablock" will not enlighten anybody as to the actual
meaning of that prologue text (the same goes for all the other
suggested _ws datanames).  This is before even thinking about how to
rigorously delimit and collect comments at all those other positions
in the file.

As a counter-proposal, I would suggest the following: every field in
the comment header should be algorithmically derivable from CIF
datanames in the datablocks.  So, to return to Herbert's original
description where 'style' and 'style version' are the two new fields,
imgCIF datanames could be defined (if current names are insufficient):
_diffrn_detector.data_style and _diffrn_detector.data_style_version.
This then gives the dictionary writers an opportunity to specify
exactly what type of information might be contained in a 'style', and
to give a URL where the variations between the various versions are
available.

A CIF preprocessor utility could then extract these two values and
construct a convenience header.
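
As a rough illustration of such a preprocessor, here is a minimal
Python sketch.  It assumes the two datanames proposed above appear as
simple "name value" pairs on a single line; the layout of the generated
header line is purely hypothetical (no header format has been agreed),
and the function names find_value and build_header are made up for this
example.  A real tool would use a proper CIF parser rather than a
regular expression.

```python
import re

# Datanames proposed in this message; they are not (yet) in any dictionary.
STYLE_TAG = "_diffrn_detector.data_style"
VERSION_TAG = "_diffrn_detector.data_style_version"


def find_value(cif_text, tag):
    """Return the single-line value that follows `tag`, or None if absent.

    Only handles the simple 'name value' form; loops, text fields and
    trailing comments are out of scope for this sketch.
    """
    pattern = re.compile(
        r"^\s*" + re.escape(tag) + r"\s+(\S.*?)\s*$", re.MULTILINE
    )
    match = pattern.search(cif_text)
    if match is None:
        return None
    # Drop surrounding CIF quote characters, if any.
    return match.group(1).strip("'\"")


def build_header(cif_text):
    """Construct a convenience comment header from the datablock contents.

    The 'style=... style_version=...' layout is a placeholder, not a
    proposal for the actual header syntax.
    """
    style = find_value(cif_text, STYLE_TAG)
    version = find_value(cif_text, VERSION_TAG)
    if style is None or version is None:
        raise ValueError("style datanames not found in CIF text")
    return f"# style={style} style_version={version}"


if __name__ == "__main__":
    sample = (
        "data_image_1\n"
        "_diffrn_detector.data_style          ADSC_210\n"
        "_diffrn_detector.data_style_version  5.4\n"
    )
    print(build_header(sample))
```

Because the header is computed only from dataname values, running the
same function on a copy of the file with every comment removed gives
the same header line.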

In summary, my position is that a CIF with no comments whatsoever is
semantically equivalent to a CIF with comments.  If comments are to be
used to improve processing of a file (and I take Harry's point in his
email in this thread) it should always be possible to generate such
comments from a comment-free version of the CIF file (even if in
practice the comments and the file might be created at the same time).
_ws.prologue-type datanames fulfill the letter but not the spirit of
this prescription, because they add no meaning.

James.

-- 
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
imgcif-l mailing list
imgcif-l@iucr.org
http://scripts.iucr.org/mailman/listinfo/imgcif-l
