Re: Opinions on comments as part of the content

In practice, CIF has one "type" of undefined value -- it is usually called
null.  Just as the number 1 has many representations (e.g. 1.0, 1, .1e1),
null has two representations: "." and "?".  From the point of view of a
program processing submitted data they have the same meaning -- the
user has not provided a value.  If you have a specified default, use
it.  If you don't have a specified default, you have an unspecified
value.  From the point of view of the user, they have two very different
meanings.  The "?" is an invitation to provide a value to be used
in place of the default, if any.  The "." is not such an invitation.
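
As a rough illustration of the rule above, here is a minimal Python
sketch.  The names NULL_TOKENS, resolve and invites_input are
hypothetical, not part of any CIF library, and the sketch assumes the
tokens have already been read as bare (unquoted) values.

```python
from typing import Optional

# The two bare-token representations of null (assumed already unquoted).
NULL_TOKENS = {".", "?"}

def resolve(raw: str, default: Optional[str] = None) -> Optional[str]:
    """Return the value a program should use for one raw token.

    Both "." and "?" mean "no value provided": fall back to the
    dictionary default if there is one, otherwise return None
    (an unspecified value).
    """
    if raw in NULL_TOKENS:
        return default
    return raw

def invites_input(raw: str) -> bool:
    """Only "?" is an invitation for the user to supply a value."""
    return raw == "?"
```

For example, resolve(".", default="Zachariasen") and
resolve("?", default="Zachariasen") give the same result to a program,
while invites_input tells the two apart for the benefit of the user.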

The question of whether

>>  > loop_
>>  > _foo _bar
>>  > a .
>>  > b .
>>  >
>>  > should be equivalent to
>>  > loop_
>>  > _foo
>>  > a
>>  > b
>>  >

depends on the dictionary.  If _bar has been declared mandatory
and has a default value, then the first construction is valid,
while the second is not.  If _bar has been declared implicit
then the two constructs are equivalent and, from the point
of view of an application, equivalent to

>>  > loop_
>>  > _foo _bar
>>  > a ?
>>  > b ?
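
One way to picture the dictionary dependence is the Python sketch
below.  It is only an illustration under assumptions: a loop is
modelled as a mapping from item name to its column of raw tokens, the
set of mandatory items stands in for the dictionary, and the helper
names is_valid and canonical are made up for this example.

```python
from typing import Dict, List, Set

NULL_TOKENS = {".", "?"}

Loop = Dict[str, List[str]]  # item name -> column of raw tokens

def is_valid(loop: Loop, mandatory: Set[str]) -> bool:
    """A loop is valid only if every mandatory item has a column."""
    return mandatory.issubset(loop)

def canonical(loop: Loop) -> Loop:
    """Drop columns that hold nothing but null tokens.

    For implicit items, loops that differ only by such columns
    then compare equal.
    """
    return {
        name: column
        for name, column in loop.items()
        if not all(token in NULL_TOKENS for token in column)
    }
```

With _bar implicit, the three loops (all ".", all "?", and no _bar
column) reduce to the same canonical form.  With _bar mandatory,
is_valid rejects the loop that omits the column.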

While it may not be easy to deal with "missing" values in
a column of numbers, it is important in many cases to
be able to do so.
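
A minimal sketch of one way to do so in Python follows.  The function
names are hypothetical, and the handling of a trailing standard
uncertainty such as 1.234(5) is a simplifying assumption: the part in
parentheses is simply discarded.

```python
from typing import List, Optional

def parse_numb_column(tokens: List[str]) -> List[Optional[float]]:
    """Convert a column of raw tokens to floats, keeping nulls as None."""
    values: List[Optional[float]] = []
    for token in tokens:
        if token in (".", "?"):
            values.append(None)  # missing value, kept in place
        else:
            # Discard any "(su)" suffix before converting.
            values.append(float(token.split("(")[0]))
    return values

def mean_present(values: List[Optional[float]]) -> Optional[float]:
    """Mean over the values that are present; None if there are none."""
    present = [v for v in values if v is not None]
    return sum(present) / len(present) if present else None
```

Keeping None in the column preserves the row alignment of the loop,
and each calculation can then decide explicitly how to treat the gaps.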

I, for one, not only think it practical to have two ways to
represent an unspecified value; I think it essential.
The distinction between "?" and "." has worked rather well
as a way to be able to guide users in filling in appropriate
"blanks" without tempting them to override defaults that
should be left alone.

   -- Herbert

>
>>...
>>  > My own heuristics are:
>>  > _foo '?'
>>  > carries no useful information other than the author hasn't bothered
>>  > to remove it from the file
>>  > _foo '.'
>>  > is highly dangerous as the dictionary can contain default values
>>  > which most users have no idea of. Thus the default extinction
>>  > correction is (or certainly was)  'Zachariasen' and algorithmically
>>  > linking '.' to this value is certain to give misleading info.
>>  >
>>  > loop_
>>  > _foo _bar
>>  > a .
>>  > b c
>>  >
>>  > has a null value for one cell - this is required to make a
>>  > rectangular table
>>  >
>>  > loop_
>>  > _foo _bar
>>  > a .
>>  > b .
>>  >
>>  > should be equivalent to
>>  > loop_
>>  > _foo
>>  > a
>>  > b
>>  >
>>  > and this construct should be avoided
>>  >
>>  > loop_
>>  > _foo _bar
>>  > a ?
>>  > b ?
>>  >
>>  > is almost certainly an unedited template and should be replaced by:
>>  >
>>  > loop_
>>  > _foo
>>  > a
>>  > b
>>  >
>>  > and finally
>>  > loop_
>>  > _foo _bar
>>  > a ?
>>  > b c
>>  >
>>  > is indistinguishable from
>>  >
>>  > loop_
>>  > _foo _bar
>>  > a .
>>  > b c
>>  >
>>  > All these issues come into very sharp focus when processing CIFs - it
>>  > is not trivial to manage '.' in a column of otherwise real numbers.
>>  >
>>  > P.
>>I take a similar approach. They both represent missing values, but
>>missing for different reasons. If one really wants a default value in
>>the dictionary, it should be "if not otherwise specified" and not "if
>>the value is '.'". In that case, both still mean missing, just different
>>reasons.
>>
>>Does ANYBODY really think it is practical to have two types of undefined
>>values?
>>
>>Of course, CIF is just a text archive. There is nothing preventing the
>>use of a string in the middle of an array of real numbers.
>
>If the CIF name occurs in a loop_ and is defined in a dictionary as
>NUMB then all values must be valid real numbers. If defined as CHAR
>it can be any sequence of legal characters (there may be length restrictions).
>
>>Some rules
>>about numeric arrays would be helpful for practical use of CIF.
>
>P.
>
>
>Peter Murray-Rust
>Unilever Centre for Molecular Sciences Informatics
>University of Cambridge,
>Lensfield Road,  Cambridge CB2 1EW, UK
>+44-1223-763069
>
>_______________________________________________
>comcifs mailing list
>comcifs@iucr.org
>http://scripts.iucr.org/mailman/listinfo/comcifs


