Specifying values 'less than something' in CIFs?

  Saulius Grazulis
  Sun, 29 Apr 2012
Dear COMCIFS members,

I am currently trying to validate all Crystallography Open Database CIFs
against the IUCr core dictionary.

A large number of value-type violations come from data items like this:

_refine_ls_shift/su_mean <0.001

(see e.g. http://www.crystallography.net/2232747.cif)

The data type in the core dictionary is specified as 'numb', but many
CIFs give string ('char') values, because of the attached "less than" sign.
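
As an illustration of why such values fail a 'numb' check, here is a
minimal sketch in Python. The pattern is a simplified approximation of
the CIF numeric syntax (optional sign, digits, optional exponent,
optional standard uncertainty in parentheses), not the full grammar; the
names CIF_NUMB and is_cif_numb are hypothetical, and the special values
'.' and '?' are deliberately not handled.

```python
import re

# Simplified approximation of a CIF 'numb' value (an assumption, not the
# complete CIF grammar): optional sign, digits with an optional decimal
# part, optional exponent, optional su in parentheses.
CIF_NUMB = re.compile(
    r"""[+-]?
        (?:\d+\.?\d*|\.\d+)     # 12, 12., 12.5 or .5
        (?:[eE][+-]?\d+)?       # optional exponent
        (?:\(\d+\))?            # optional standard uncertainty, e.g. (3)
    """,
    re.VERBOSE,
)


def is_cif_numb(value: str) -> bool:
    """Return True if the whole string looks like a CIF number."""
    return CIF_NUMB.fullmatch(value) is not None


if __name__ == "__main__":
    for v in ["0.001", "0.0000(3)", "<0.001"]:
        print(v, is_cif_numb(v))
```

With this check, '0.001' and '0.0000(3)' are accepted, while '<0.001' is
rejected because of the leading "less than" sign.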

For a human reader, the message in these data items seems more or less
clear: I interpret it as if the authors wanted to convey that they are
"pretty sure that the value is negligible and can be treated as 0 for
all practical purposes; with very high probability it is less than
0.001".

How do we express this in a CIF dictionary-consistent way?

One possibility would be to put in the value 0 (this is the lowest
possible value for the _refine_ls_shift/esd_mean and other such tags),
denoting that in computations, the values (shifts) can be neglected.
Then we could reason that since the authors put '<0.001' they are pretty
sure about it, so the probability of this being true is above 99%.
Therefore, if the measured values were normally distributed around the
mean 0, 0.001 would be something like 3*sigma ("the three sigma rule"),
and thus the esd would be 0.001/3, approximately 0.0003. This would
yield the CIF encoding:

_refine_ls_shift/su_mean 0.0000(3)

Of course the values cannot be negative, we are not sure about
normality, and we do not know how precisely the authors estimated the
shifts or what confidence intervals they had in mind. But since we do
not have any more reliable estimate of the standard deviation for this
value, the above notation should convey about the same message as
'<0.001', but in a CIF-consistent way.
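
The conversion described above could be sketched in Python roughly as
follows. This is only an illustration of the proposed rule, not an
agreed convention: the function name encode_upper_bound, the n_sigma
parameter, and the choice to print one more decimal place than the
original bound (so that 0.001/3 shows up as '(3)') are all assumptions
made for this example.

```python
import re

# Matches values such as '<0.001' or '< 0.01' (assumed input form).
LESS_THAN = re.compile(r"<\s*(\d*\.\d+|\d+)")


def encode_upper_bound(value: str, n_sigma: int = 3) -> str:
    """Rewrite '<bound' as '0.000...(su)' with su = bound / n_sigma.

    Values that do not start with '<' are returned unchanged.
    """
    m = LESS_THAN.fullmatch(value.strip())
    if m is None:
        return value

    bound_text = m.group(1)
    # One extra decimal place, so the su gets a significant digit.
    decimals = len(bound_text.partition(".")[2]) + 1
    # su expressed in units of the last printed decimal place.
    su_digits = max(round(float(bound_text) / n_sigma * 10**decimals), 1)
    return f"{0.0:.{decimals}f}({su_digits})"


if __name__ == "__main__":
    print(encode_upper_bound("<0.001"))
```

For the example from the CIF above, encode_upper_bound("<0.001") gives
'0.0000(3)', the encoding suggested in this message.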

I think such an encoding should not confuse any valid CIF reader; what
do you think? Do you have any other suggestions for how facts of the
form 'value is less than ....' could or should be recorded?

I would like to run an automatic conversion on COD and replace all
similar data items in a consistent and transparent way, so that the
validation messages for these data items do not obscure more serious
problems.

Sincerely yours,

Dr. Saulius Gražulis
Institute of Biotechnology, Graiciuno 8
LT-02241 Vilnius, Lietuva (Lithuania)
fax: (+370-5)-2602116 / phone (office): (+370-5)-2602556
mobile: (+370-684)-49802, (+370-614)-36366
