Discussion List Archives


Re: Variants

I'll make some opening comments regarding the idea of variants:

(1) It seems to me that the closer a given CIF file is to the raw data, the more useful it is to record variants: the best path forward has not yet been identified, so keeping the different variations is worthwhile.  Conversely, when publishing what is supposed to be the final ("correct") result, the wider community will primarily be interested in that result rather than in alternatives that the author considers inferior.  Taking the wavelength proposal as an example: once someone has refined a wavelength from a standard material, the nominal wavelength is no longer scientifically relevant, so there is no reason to keep it and all its derived values in the file (which is why Nick's alternative wavelength proposal is preferable, as only one wavelength is in the file).

(2) Introducing variants means that multiple values for simple items such as cell parameters could be present in a single datablock, and CIF reading software must be rewritten to recognise which of those instances of cell parameters it needs to care about (not to mention all those programs which expect unlooped cell parameters...).  This is a very serious issue for small molecule CIF, where many programs already exist.  I don't expect that this is so serious for imgCIF, where (unfortunately) imgCIF applications are thin on the ground.

(3) What are our use cases for this change?  What is the motivation?  Perhaps Herb could speak to this.

(4) Introduction of DDLm and dREL may change the variant scheme such that only a limited set of variant values would need to be stored in any CIF data file; a dREL engine could then calculate the corresponding alternative derived values (and all combinations...).  But again, for published data, we expect the author to have done this already and chosen the best result.
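
To make point (4) concrete, here is a rough sketch in Python (not dREL) of the kind of derivation such an engine might perform.  It assumes that the measured Bragg angles are held fixed, so that every d-spacing, and hence every cell length, scales linearly with the wavelength while the cell angles are unchanged.  The function name derive_cell is hypothetical, and the numbers are borrowed from the quoted example below purely for illustration; that example's own cell values are not claimed to follow this scaling.

```python
import math


def derive_cell(cell, wavelength_from, wavelength_to):
    """Rescale cell lengths for a change of wavelength.

    Assumption: Bragg angles are fixed, so d = lambda / (2 sin(theta))
    scales linearly with lambda.  Cell angles are left unchanged.
    """
    if wavelength_from <= 0 or wavelength_to <= 0:
        raise ValueError("wavelengths must be positive")
    scale = wavelength_to / wavelength_from
    return {
        name: value * scale if name.startswith("_cell_length_") else value
        for name, value in cell.items()
    }


# Only the cell stored with the nominal wavelength needs to be in the file.
stored_cell = {
    "_cell_length_a": 22.3,
    "_cell_length_b": 22.3,
    "_cell_length_c": 22.3,
    "_cell_angle_alpha": 90.0,
    "_cell_angle_beta": 90.0,
    "_cell_angle_gamma": 90.0,
}

# Derive the cell that corresponds to the refined wavelength.
derived_cell = derive_cell(stored_cell, wavelength_from=1.25, wavelength_to=1.23456)
```

With something like this available, only one cell and the list of wavelength variants would need to be stored; the remaining combinations could be regenerated on demand.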


On Thu, Nov 26, 2009 at 10:59 AM, James Hester <jamesrhester@gmail.com> wrote:
I'm reposting Herbert's message in a new thread to aid organisation.  Herbert wrote:

----
Dear Colleagues,

 While you are revisiting this item, I would suggest you consider the more complete (and, I believe, more elegant and general) solution of defining "variants" that we have introduced into the imgCIF dictionary to handle quantities that may be determined in different ways.

 Instead of adding

 _diffrn_radiation_wavelength_determination

you would add

 _diffrn_radiation_wavelength_variant

and a new variant category


       _variant_variant
       _variant_role
       _variant_timestamp
       _variant_variant_of
       _variant_details

which would allow you with complete generality to manage any number
of refined or redefined quantities, such as wavelengths.  This would
then allow you to use the same variant identifier for, say, cell
dimensions, which could be expected to change in a coupled manner
with the changes in wavelength.

 If you are interested in this more complete approach, I can provide
you with the full item definitions, but the short form is:

       _variant_variant


             The value of _variant_variant must uniquely identify
             each variant for the given diffraction experiment and/or
             entry

       _variant_role

             The value of _variant_role specifies a role
             for this variant.  Possible roles are null, "preferred",
             "raw data", and "unsuccessful trial".

       _variant_timestamp


             The date and time identifying a variant.  This is not
             necessarily the precise time of the measurement or
             calculation of the individual related data items, but a timestamp that
             reflects the order in which the variants were defined.

       _variant_variant_of


             The value of _variant_variant_of gives the variant
             from which this variant was derived.  If this value is not
             given, the variant is assumed to be derived from the default
             null variant.

       _variant_details


             A description of special aspects of the variant


An example of how this might be used is:

        loop_
            _diffrn_radiation_wavelength_id
            _diffrn_radiation_wavelength
            _diffrn_radiation_wavelength_determination
               1   1.23456   fundamental
               2   1.25      estimated


would become

         loop_
             _diffrn_radiation_wavelength_variant
             _diffrn_radiation_wavelength
                final   1.23456
                prelim  1.25

         loop_
             _variant_variant
             _variant_role
             _variant_timestamp
             _variant_variant_of
             _variant_details
             final preferred 2007-08-04T01:17:28 prelim refined
             prelim .        2007-08-03T23:20:00 . .

         loop_
            _cell_variant
            _cell_length_a
            _cell_length_b
            _cell_length_c
            _cell_angle_alpha
            _cell_angle_beta
            _cell_angle_gamma
            final  22.5 22.5 22.5 90. 90. 90.
            prelim 22.3 22.3 22.3 90. 90. 90.
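
As a rough illustration of what reading software would have to do with such loops, the following Python sketch picks out the "preferred" variant and follows the _variant_variant_of links back to the default null variant.  It is only a sketch under stated assumptions: parse_loop is a hypothetical toy parser that handles simple whitespace-separated loops (no quoted or multi-line values), and the other helper names are likewise invented for this example rather than taken from any existing CIF library.

```python
from itertools import takewhile


def parse_loop(text):
    """Parse one simple CIF loop_ block into a list of row dictionaries."""
    tokens = text.split()
    if not tokens or tokens[0] != "loop_":
        raise ValueError("expected a loop_ block")
    names = list(takewhile(lambda t: t.startswith("_"), tokens[1:]))
    values = tokens[1 + len(names):]
    if not names or len(values) % len(names) != 0:
        raise ValueError("value count does not match the number of data names")
    n = len(names)
    return [dict(zip(names, values[i:i + n])) for i in range(0, len(values), n)]


def preferred_variant(variant_rows):
    """Return the identifier of the single variant whose role is 'preferred'."""
    found = [r["_variant_variant"] for r in variant_rows
             if r["_variant_role"] == "preferred"]
    if len(found) != 1:
        raise ValueError("expected exactly one preferred variant")
    return found[0]


def rows_for_variant(rows, variant_item, variant):
    """Select the rows of any category that belong to the given variant."""
    return [r for r in rows if r[variant_item] == variant]


def ancestry(variant_rows, variant):
    """Follow _variant_variant_of links until a null value ('.' or '?')."""
    parent = {r["_variant_variant"]: r["_variant_variant_of"] for r in variant_rows}
    chain = [variant]
    while parent.get(chain[-1], ".") not in (".", "?"):
        nxt = parent[chain[-1]]
        if nxt in chain:
            raise ValueError("cycle in _variant_variant_of links")
        chain.append(nxt)
    return chain


variants = parse_loop("""
loop_
  _variant_variant _variant_role _variant_timestamp
  _variant_variant_of _variant_details
  final  preferred 2007-08-04T01:17:28 prelim refined
  prelim .         2007-08-03T23:20:00 .      .
""")

cells = parse_loop("""
loop_
  _cell_variant _cell_length_a _cell_length_b _cell_length_c
  _cell_angle_alpha _cell_angle_beta _cell_angle_gamma
  final  22.5 22.5 22.5 90. 90. 90.
  prelim 22.3 22.3 22.3 90. 90. 90.
""")

best = preferred_variant(variants)
best_cell = rows_for_variant(cells, "_cell_variant", best)[0]
history = ancestry(variants, best)
```

A program that only wants a single cell would use best_cell and ignore the other rows; the ancestry chain is needed only by software that cares how the preferred values were arrived at.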


 Regards,
   Herbert


=====================================================
 Herbert J. Bernstein, Professor of Computer Science
  Dowling College, Kramer Science Center, KSC 121
       Idle Hour Blvd, Oakdale, NY, 11769

                +1-631-244-3035
                yaya@dowling.edu
=====================================================




--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
