Discussion List Archives

Re: Opinions on comments as part of the content

peter murray-rust wrote:
...
> Our approach allows the prettification and reordering through 
> stylesheets. With CIFDOM it is trivial (using XSLT) to reorder the 
> items and loops in whatever way you wish. It is possible, though more 
> difficult, to reorder columns in tables. There is no obvious way in 
> which rows of tables can be reordered. (but note that they are 
> intrinsically unordered)
It sounds like your use of a 'style sheet' is along the same lines as my
concept of adding format hints to the dictionary. Perhaps a style sheet
is better, because this allows customization for a user's needs, rather
than the impossible task of universal agreement on the 'best' layout.

Given that CIF is not XML, I think it would be better to format the
style information as CIF. It might be easier to define features more
relevant to CIF (e.g. XML does not have arrays). It seems like it should
be easy to define a sort order for rows and columns if desired. Would
this fit into your style-sheet design?
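
To make the sort-order idea concrete, here is a rough Python sketch of what
a style rule for a loop might do. The function name, its arguments, and the
list-of-rows representation are all my own assumptions for illustration;
nothing here is part of CIF, CIFDOM, or any existing style-sheet design.

```python
def apply_loop_style(columns, rows, column_order=None, sort_by=None):
    """Reorder the columns of a loop and optionally sort its rows.

    columns      -- list of item names, e.g. ["_foo", "_bar"]
    rows         -- list of rows, each a list of string values
    column_order -- preferred leading columns (hypothetical style hint)
    sort_by      -- item name to sort rows on (hypothetical style hint)
    """
    # Columns named in the style come first; the rest keep their file order.
    order = [c for c in (column_order or []) if c in columns]
    order += [c for c in columns if c not in order]

    # Rebuild every row in the new column order.
    idx = [columns.index(c) for c in order]
    new_rows = [[row[i] for i in idx] for row in rows]

    # Rows are intrinsically unordered, so sorting is purely cosmetic.
    # This compares values as strings, not as numbers.
    if sort_by is not None:
        key = order.index(sort_by)
        new_rows.sort(key=lambda r: r[key])
    return order, new_rows
```

Calling apply_loop_style(["_foo", "_bar"], rows, column_order=["_bar"],
sort_by="_foo") would put _bar first and list the rows by _foo.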
...
>> Do the standards not state that names must be unique?
> 
> Indeed it does. This does not stop many authors duplicating names
Well, they are wrong. However, you did bring up a good point. Is
duplication valid if the items are actually parts of the same data, but
split up for some reason?

...
> 
> NO! global_ is part of STAR but not CIF. That is part of the problem. 
> I don't know who invented data_global but it wasn't an agreed 
> heuristic. My own belief is that in a file such as
> 
> data_global
>    content_g
> data_1
>    content_1
> data_2
>    content_2
> 
> the heuristics are:
> * this is semantically equivalent to two separate CIFs:
> 
> data_1
>    content_g
>    content_1
> 
> and
> 
> data_2
>    content_g
>    content_2
> 
> * This requires that no items in data_global have the same names as 
> any in data_1 or data_2. This is nowhere defined and should be
> * that the two CIFs have no other semantic relation other than any 
> that can be deduced from the common items in data_global
I know that global_ is not part of CIF, but neither is this hack for
using data_global. CIF says global_ is reserved for possible future use.
Obviously people want a global_, so let's include it.
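
For reference, the data_global heuristic quoted above is simple enough to
sketch in Python. This is only an illustration under my own assumptions: the
function name and the dict-of-dicts representation of parsed blocks are
hypothetical, and the name-clash check implements the rule that is, as
noted, nowhere actually defined.

```python
def expand_global(blocks, global_name="global"):
    """Copy the items of data_global into every other data block.

    blocks -- dict mapping block name -> dict of item name -> value
              (a hypothetical in-memory form of a parsed CIF)
    """
    shared = blocks.get(global_name, {})
    expanded = {}
    for name, items in blocks.items():
        if name == global_name:
            continue
        # The heuristic requires that no item appears in both places.
        clash = sorted(set(shared) & set(items))
        if clash:
            raise ValueError(f"data_{name} redefines global items: {clash}")
        # Global items first, then the block's own items.
        expanded[name] = {**shared, **items}
    return expanded
```

The result contains one self-contained block per original data block, with
no semantic relation between them other than the shared items.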

...
> My own heuristics are:
> _foo '?'
> carries no useful information other than the author hasn't bothered 
> to remove it from the file
> _foo '.'
> is highly dangerous as the dictionary can contain default values 
> which most users have no idea of. Thus the default extinction 
> correction is (or certainly was) 'Zachariasen' and algorithmically 
> linking '.' to this value is certain to give misleading info.
> 
> loop_
> _foo _bar
> a .
> b c
> 
> has a null value for one cell - this is required to make a rectangular table
> 
> loop_
> _foo _bar
> a .
> b .
> 
> should be equivalent to
> loop_
> _foo
> a
> b
> 
> and this construct should be avoided
> 
> loop_
> _foo _bar
> a ?
> b ?
> 
> is almost certainly an unedited template and should be replaced by:
> 
> loop_
> _foo
> a
> b
> 
> and finally
> loop_
> _foo _bar
> a ?
> b c
> 
> is indistinguishable from
> 
> loop_
> _foo _bar
> a .
> b c
> 
> All these issues come into very sharp focus when processing CIFs - it 
> is not trivial to manage '.' in a column of otherwise real numbers.
> 
> P.
I take a similar approach. Both '?' and '.' represent missing values,
but missing for different reasons. If one really wants a default value
in the dictionary, it should apply "if not otherwise specified" and not
"if the value is '.'". In that case, both still mean missing, just for
different reasons.
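
As a sketch of that rule in Python: the dictionary default is used only when
the item is absent from the block, never when it is present as '.' or '?'.
The function name and the plain-dict representation are assumptions of mine,
and I assume the parser hands unquoted . and ? through as those strings.

```python
MISSING = {".", "?"}  # both mean "no value", for different reasons


def lookup(block, name, defaults):
    """Return the value of an item, or None if it is missing.

    block    -- dict of item name -> value for one data block
    defaults -- dict of item name -> dictionary default (hypothetical)
    """
    if name not in block:
        # "If not otherwise specified": the only case where a default applies.
        return defaults.get(name)
    value = block[name]
    # An explicit '.' or '?' is missing data, never a request for the default.
    return None if value in MISSING else value
```

With a default of 'Zachariasen' for some item, a block that omits the item
gets the default, while a block that gives '.' gets no value at all.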

Does ANYBODY really think it is practical to have two types of undefined
values?

Of course, CIF is just a text archive. There is nothing preventing the
use of a string in the middle of an array of real numbers. Some rules
about numeric arrays would be helpful for practical use of CIF.
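
One possible rule, sketched in Python: in a numeric column every value must
either parse as a number or be one of the two missing markers. The function
name is hypothetical, and this deliberately ignores values that carry a
standard uncertainty in parentheses, which float() would reject.

```python
def parse_numeric_column(values):
    """Convert a column of CIF strings to floats, with None for '.' and '?'.

    Raises ValueError if any other non-numeric string appears.
    """
    out = []
    for v in values:
        if v in (".", "?"):
            out.append(None)  # missing value, whatever the reason
            continue
        try:
            out.append(float(v))
        except ValueError:
            raise ValueError(
                f"non-numeric value {v!r} in numeric column"
            ) from None
    return out
```

A reader following this rule would reject a stray string in the middle of an
array of reals instead of silently passing it along.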

Joe

