Discussion List Archives


Re: Variants

I'll make some opening comments regarding the idea of variants:

(1) It seems to me that the closer a given CIF file is to the raw data, the more useful it is to record variants: the best path forward has not yet been identified, so keeping different variations is worthwhile. Conversely, when publishing what is supposed to be the final ("correct") result, the wider community will primarily be interested in that result, rather than in alternative results that the author considers inferior. Taking the wavelength proposal as an example: once someone has refined a wavelength from a standard material, the nominal wavelength is no longer scientifically relevant, and so there is no reason to keep it and all values derived from it in the file (which is why Nick's alternative wavelength proposal is preferable, as only one wavelength is in the file).

(2) Introducing variants means that multiple values for simple items such as cell parameters could be present in a single datablock, and CIF reading software must be rewritten to recognise which of those instances of cell parameters it needs to care about (not to mention all those programs which expect unlooped cell parameters...).  This is a very serious issue for small molecule CIF, where many programs already exist.  I don't expect that this is so serious for imgCIF, where (unfortunately) imgCIF applications are thin on the ground.

(3) What are our use cases for this change?  What is the motivation?  Perhaps Herb could speak to this.

(4) Introduction of DDLm and dREL may change the variant scheme such that only a limited set of variant values would need to be made available in any CIF data file, and a dREL engine could then calculate out the corresponding alternative derived values (and all combinations...).  But again, for published data, we expect the author to have done this already and chosen the best result.

On Thu, Nov 26, 2009 at 10:59 AM, James Hester <jamesrhester@gmail.com> wrote:
I'm reposting Herbert's message in a new thread to aid organisation.  Herbert wrote:

Dear Colleagues,

 While you are revisiting this item, I would suggest you consider the more complete (and, I believe, more elegant and general) solution of defining "variants", which we have introduced into the imgCIF dictionary to handle quantities that may be determined in different ways.

 Instead of adding


you would add


and a new variant category


which would allow you with complete generality to manage any number
of refined or redefined quantities, such as wavelengths.  This would
then allow you to use the same variant identifier for, say, cell
dimensions, which could be expected to change in a coupled manner
with the changes in wavelength.

 If you are interested in this more complete approach, I can provide
you with the full item definitions, but the short form is:


             The value of _variant_variant must uniquely identify
             each variant for the given diffraction experiment and/or


             The value of _variant_role specifies a role for this
             variant.  Possible roles are null, "preferred",
             "raw data", and "unsuccessful trial".


             The date and time identifying a variant.  This is not
             necessarily the precise time of the measurement or
             calculation of the individual related data items, but a timestamp that
             reflects the order in which the variants were defined.


             The value of _variant.variant_of gives the variant
             from which this variant was derived.  If this value is not
             given, the variant is assumed to be derived from the default
             null variant.


             A description of special aspects of the variant
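
The variant_of relationship described above forms a chain from each variant back to the default null variant. A minimal Python sketch of walking that chain follows; the function name and the dictionary representation are illustrative assumptions, not part of the imgCIF dictionary, and a missing or None parent is taken to mean "derived from the default null variant".

```python
from typing import Dict, List, Optional


def variant_lineage(variant_of: Dict[str, Optional[str]], variant_id: str) -> List[str]:
    """Return variant_id followed by its ancestors, oldest last.

    variant_of maps a variant identifier to the identifier it was derived
    from (hypothetical in-memory form of the variant_of item). A value of
    None, or a missing key, stands for the default null variant.
    """
    lineage: List[str] = []
    seen = set()
    current: Optional[str] = variant_id
    while current is not None:
        if current in seen:
            # A variant cannot sensibly be derived from itself.
            raise ValueError(f"cycle in variant_of chain at {current!r}")
        seen.add(current)
        lineage.append(current)
        current = variant_of.get(current)
    return lineage


# "final" was derived from "prelim"; "prelim" gives no variant_of value.
parents = {"final": "prelim", "prelim": None}
print(variant_lineage(parents, "final"))
```

The cycle check is a defensive choice for hand-edited files; nothing in the definitions above says how software should treat such a case.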

An example of how this might be used is:

               1   1.23456   fundamental
               2   1.25      estimated

would become

                final   1.23456
                prelim  1.25

             final preferred 2007-08-04T01:17:28 prelim refined
             prelim .        2007-08-03T23:20:00 . .

            final  22.5 22.5 22.5 90. 90. 90.
            prelim 22.3 22.3 22.3 90. 90. 90.
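
To illustrate what reading software would have to do with such a file, here is a minimal Python sketch that picks the cell parameters belonging to the variant whose role is "preferred". The loop headers are missing from the archived example, so the column order assumed here (variant, role, timestamp, variant_of, details) and all Python names are assumptions; the CIF null value "." is represented as None.

```python
from typing import Dict, List, Optional, Tuple

# Rows of the variant category from the example (column order assumed).
VARIANTS: List[Dict[str, Optional[str]]] = [
    {"variant": "final", "role": "preferred",
     "timestamp": "2007-08-04T01:17:28", "variant_of": "prelim",
     "details": "refined"},
    {"variant": "prelim", "role": None,
     "timestamp": "2007-08-03T23:20:00", "variant_of": None,
     "details": None},
]

# Cell parameters (a, b, c, alpha, beta, gamma) keyed by variant identifier.
CELLS: Dict[str, Tuple[float, ...]] = {
    "final": (22.5, 22.5, 22.5, 90.0, 90.0, 90.0),
    "prelim": (22.3, 22.3, 22.3, 90.0, 90.0, 90.0),
}


def preferred_variant(variants: List[Dict[str, Optional[str]]]) -> Optional[str]:
    """Return the identifier of the single variant marked "preferred", if any."""
    preferred = [v["variant"] for v in variants if v["role"] == "preferred"]
    if len(preferred) > 1:
        raise ValueError("more than one variant is marked preferred")
    return preferred[0] if preferred else None


chosen = preferred_variant(VARIANTS)
if chosen is not None:
    print(chosen, CELLS[chosen])
```

Treating more than one "preferred" variant as an error is a choice made for this sketch, not something the definitions above state.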


 Herbert J. Bernstein, Professor of Computer Science
  Dowling College, Kramer Science Center, KSC 121
       Idle Hour Blvd, Oakdale, NY, 11769


T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
