Discussion List Archives

(77) Core extensions; mmCIF; absolute structure; multipole coeffs; DDL

  • To: COMCIFS@iucr
  • Subject: (77) Core extensions; mmCIF; absolute structure; multipole coeffs; DDL
  • From: bm
  • Date: Mon, 24 Nov 1997 15:27:23 GMT

Dear Colleagues

Thanks for your ready responses to last week's circular. This time I have
some feedback (from David Brown, "D>" below), and some new issues of both
content and policy.

Existing discussion threads
===========================
D76.1 Core extensions
---------------------
D> 	A quick look through the proposed core changes raises the
D> following queries and comments:
D> 
D> *_obs
D> 	These items are all referenced to *_gt.  Since we have decided to
D> go for *_gt, why are we now introducing *_obs items for the first time? 
D> Surely it is bad enough to build in obsolescence without actually
D> introducing data names that are already obsolete before they are defined! 
D> (Does 'obs' stand for obsolete? :).  The same comment applies to
D> *_shift/esd_*. 

The reference is through the DDL1.4 terms "_related_item" and
"_related_function  replace". These are added to the *old* item, not the new
one. Although this may seem counter-intuitive, the idea is that a CIF reader
working through archival data will be pointed forward to the definition that
supersedes the older one. DDL2 has a better bidirectional formalism, where
the related functions are "replaces" and "replacedby", and I think this
would be an improvement that one might consider for DDL version 1.5. No new
'obs' terms have been introduced.
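
A minimal sketch, in Python, of how archive-reading software might follow
such forward pointers. The mapping is a hypothetical in-memory stand-in for
the 'replace' entries of a dictionary, the name pairs are assumed for
illustration, and the function name is invented.

```python
from typing import Dict

# Hypothetical stand-in for dictionary entries: each superseded data name
# maps to the name given by its 'replace' pointer. Pairs are illustrative.
REPLACED_BY: Dict[str, str] = {
    "_refine_ls_R_factor_obs": "_refine_ls_R_factor_gt",
    "_refine_ls_shift/esd_mean": "_refine_ls_shift/su_mean",
}


def current_name(name: str, replaced_by: Dict[str, str]) -> str:
    """Follow replacement pointers until a name with no successor is reached."""
    seen = set()
    while name in replaced_by:
        if name in seen:
            # Guard against a badly formed dictionary with a pointer loop.
            raise ValueError(f"circular replacement chain at {name!r}")
        seen.add(name)
        name = replaced_by[name]
    return name
```

A name with no pointer is returned unchanged, so the same call can be applied
to every data name met in an archival CIF.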

D> _refine_ls_wR_factor_gt
D> 	We discussed the definitions of R factors some time ago and I
D> thought we had decided that there was only one possible definition of wR,
D> namely what here is called *_wR_factor_ref.  Any omitted reflections are
D> those in which w is set equal to zero, so, since w is presumably defined
D> for each reflection, *_wR_factor_gt is a nonsense and should not be
D> allowed into the dictionary (this would also exclude *_obs).  If it is
D> absolutely essential to include this definition for historical reasons, is
D> there any way we can point out its mathematical absurdity and
D> inappropriateness for any valid crystallographic purpose? 

There are many instances of *_obs in the Acta archive. The *_gt form was
introduced solely to yield a parallel translation to the other *_obs->*_gt
changes. I am willing to entertain the idea that we do not introduce the
*_gt form, but instead add to *_obs the entries
    _related_item     '_refine_ls_wR_factor_ref'
    _related_function   replace
as a pointer to (database querying) software to consider the *_ref name as a
replacement for *_obs (though the definitions are in fact slightly
different). Are there any objections to this?
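
David's point about zero weights can be checked numerically. The sketch below
assumes the usual form wR = [sum w(Yo - Yc)^2 / sum w Yo^2]^(1/2), with Y
standing for whichever quantity is refined; the function name and any sample
numbers are invented for illustration.

```python
import math
from typing import Sequence


def weighted_r(y_obs: Sequence[float],
               y_calc: Sequence[float],
               weights: Sequence[float]) -> float:
    """Weighted R factor over all reflections, using the supplied weights."""
    num = sum(w * (o - c) ** 2 for o, c, w in zip(y_obs, y_calc, weights))
    den = sum(w * o ** 2 for o, w in zip(y_obs, weights))
    if den == 0:
        raise ValueError("sum of w*Yo^2 is zero; wR is undefined")
    return math.sqrt(num / den)
```

A reflection given w = 0 contributes nothing to either sum, so the result is
the same as if that reflection had been left out of the list altogether.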

D> _refine_ls_shift/su_mean
D> 	The related item I assume should be *_shift/esd_mean, not
D> *_shift/su_mean.

I've deleted the "_related_item" entry altogether in this case.

D> _reflns_number_Friedel
D> 	The last line of the definition should read 'anomalous scattering'
D> not 'inelastic scattering'.

OK, I've made that change. 

D> _reflns_threshold_expression
D> 	Would it be better to say 'USUALLY based on multiples of' or
D> 'based on FUNCTIONS of'?  I do not see any need to restrict the nature of
D> the function being defined in such a drastic way, even if it does cover
D> 99% of current usage.

OK:
    _definition
;              The threshold, usually based on multiples of \s(I), \s(F^2^)
               or \s(F), that serves to identify significantly intense
               reflections, the number of which is given by _reflns_number_gt.
               These reflections are used in the calculation of
               _refine_ls_R_factor_gt.
;
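
As an illustration of the most common case, the sketch below counts the
reflections with I > n*sigma(I). The multiple n = 2 is an assumed default,
not something the definition prescribes, and the function name is
hypothetical.

```python
from typing import Sequence


def count_gt(intensities: Sequence[float],
             sigmas: Sequence[float],
             multiple: float = 2.0) -> int:
    """Count reflections whose intensity exceeds multiple * sigma(I)."""
    if len(intensities) != len(sigmas):
        raise ValueError("intensities and sigmas must have the same length")
    return sum(1 for i, s in zip(intensities, sigmas) if i > multiple * s)
```

A threshold based on some other function of the standard uncertainties would
only need a different comparison inside the sum.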


D76.2 Maintenance of the mmCIF dictionary
-----------------------------------------

D> 	I am glad to see the mmCIF committee getting themselves organised. 
D> It bodes well for the future of the dictionary, since there will
D> undoubtedly be a large number of minor modifications that will need quick
D> action as the dictionary is brought into use. 
D> 
D> 	I am a little concerned, however, about the lack of reference to
D> the IUCr ownership of CIF, and COMCIFS's role as agent, in the Plan for an
D> editorial board. True, the procedures do state that the final version of
D> the dictionary will be referred to COMCIFS for approval.  The timetable
D> seems to take preliminary approval as a technicality, allowing only 15
D> days for this step.  This is hardly enough time for members of COMCIFS to
D> study the details of the proposal and certainly does not allow for any
D> controversial issues to be raised.  In all likelihood none will be raised,
D> but COMCIFS approval, even preliminary approval, should not be taken for
D> granted.  Having said this, let me add that I will do what I can to
D> expedite the approval of any dictionary through COMCIFS.  I appreciate the
D> importance of time as a factor in the approval of these dictionaries. 
D> 
D> 	The Plan for the Editorial Board needs explicitly to recognise
D> that CIF is the property of the IUCr and is managed on behalf of IUCr by
D> COMCIFS.  COMCIFS has not only the responsibility for putting a stamp of
D> approval on the dictionaries that are brought before it after checking
D> that they conform to the rules of DDL and CIF, but it also has the
D> responsibility for ensuring that they conform to the other policies of
D> COMCIFS, in particular those relating to compatibility between
D> dictionaries. If all dictionary committees treat COMCIFS approval as a
D> formality, it will not be long before the various dictionaries will evolve
D> along separate paths and become mutually incompatible.  We have already
D> moved some way along this road with the simultaneous use of DDL1 and DDL2,
D> which requires that, at the very least, the core dictionary must be
D> maintained in two distinct versions.  However, this decision was a
D> decision of COMCIFS and was taken after a certain amount of soul
D> searching.  COMCIFS has the responsibility for approving any extensions to
D> the CIF rules and the use of extensions to the DDL rules, and this
D> approval must be obtained before these extensions are incorporated into a
D> revised dictionary.  It would be well to spell out quite clearly the
D> respective roles of the Editorial Board and COMCIFS.  It is important that
D> the mmCIF Editorial Board have a statement in their constitution that
D> identifies their role relative to the IUCr as owners of CIF and COMCIFS as
D> the agent of IUCr.  This statement should appear near the beginning of
D> item 1. in the 'Plan for Extending the mmCIF Dictionary' as it was given
D> in Circular 76.
D> 
D> 	With this addition, the Plan will provide a good model for other
D> committees that are set up to extend and maintain the dictionaries that
D> COMCIFS has approved.


New discussions
===============

D77.1 Multipole population coefficients
---------------------------------------
I received this inquiry from Paul Mallinson, who is Secretary for the XD
program package:

P> The user group of the Charge Density Analysis program package XD recently 
P> raised the question of how to deal with multipole population coefficient data
P> in manuscripts submitted for publication. At present they often appear as 
P> a Table created via a LaTeX file which is output from XD. An alternative 
P> would be to include these data in the Crystallographic Information File.
P> As far as the XD programming group is aware, there is at present no 
P> corresponding CIF definition. I understand that the CIF content is being 
P> reviewed at present, and I would be grateful for your advice.

Does anyone have any suggestions on this? Perhaps Mark Spackman's work on
electron density CIF terms may have thrown up some ideas? Mark, are you
able yet to give us any indication of overall progress with your project?

D77.2 Absolute structure
------------------------
Howard Flack was in Chester recently, mostly on business connected with the
Electronic Publishing Committee. He has also been doing some work for the
Cambridge Database on clarifying the interpretation of the Flack x parameter,
and while in Chester he reviewed the way this was reported in the CIF
archive. The following remarks arise out of this.

H> Suggested changes to the CIF core for AS: (version 18th Nov)
H> ==========================================================
H> (1) For "data_refine_ls_abs_structure_Flack" I suggest the amended form
H> given below. The justification is as follows, u being the standard
H> uncertainty of x.
H> 
H> For centrosymmetric structures the value of the parameter is undefined.
H> The producer of a CIF has two options: (i) leave out the data name and
H> corresponding value altogether or (ii) include the data name with a
H> value of 'inapplicable' represented by '.' .
H> 
H> For noncentrosymmetric structures, the physical range of the parameter
H> is 0 =< x =< 1 but statistical fluctuations in the observations lead to
H> statistical fluctuations in the value of x obtained by least-squares
H> refinement which may thus lie outside the physical range within a few
H> standard uncertainties. In fact as the vast majority of samples are
H> indeed single crystals the most common values of x are 'close' to either
H> 0 or 1, the boundaries of the physical range.
H> 
H> Within the current CIF dictionary definitions, it is not possible to
H> express the above interval of x correctly in _enumeration_range. I have
H> taken the liberty of breaking the rules in order to express my intent.
H> 
H> data_refine_ls_abs_structure_Flack
H>     _name                      '_refine_ls_abs_structure_Flack'
H>     _category                    refine
H>     _type                        numb
H>     _type_conditions             esd
H>     _enumeration_range           -3.0*u:1.0+3.0*u
H>     _definition
H> ;            The measure of absolute structure (enantiomorph or polarity) as
H>              defined by Flack.
H>  
H>              For centrosymmetric structures the only permitted value, if the 
H>              data name is present, is 'inapplicable' represented by '.' .
H> 
H>              For non-centrosymmetric structures the value must lie in the
H>              99.97% Gaussian confidence interval  -3u =< x =< 1 + 3u and a
H>              standard uncertainty (e.s.d.) u must be supplied.
H> 
H>              Ref: Flack, H. D. (1983). Acta Cryst. A39, 876-881.
H> ;

I've changed the wording of the definition. The enumeration range I have
left unchanged as 0.0:1.0, because - as Howard points out - the formalism is
not sufficiently well developed to provide an unambiguous machine-parsable
instruction to validation software. It may be that this can be done through
association of "methods" (i.e. procedures expressed in some agreed computer
language or metalanguage); there is already provision for such an approach
in DDL2, but it is as yet only a placeholder - I'm not aware of any well
developed applications that use embedded methods.
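
To make the intent concrete, here is a minimal sketch of such a procedure,
written in Python rather than in any DDL method language. It assumes the
value is written in the usual CIF form with the standard uncertainty in
parentheses, e.g. 0.02(3); the function names are hypothetical, the parser is
deliberately simplified, and the factor 3 is taken from Howard's proposal.

```python
import re
from typing import Tuple

# Simplified pattern for a CIF number with s.u., e.g. "0.02(3)" or "-0.2(1)".
_NUMB_SU = re.compile(r"^([+-]?\d+(?:\.(\d+))?)\((\d+)\)$")


def parse_numb_su(text: str) -> Tuple[float, float]:
    """Split 'value(su)' into (value, standard uncertainty)."""
    m = _NUMB_SU.match(text.strip())
    if m is None:
        raise ValueError(f"not a number with standard uncertainty: {text!r}")
    decimals = len(m.group(2) or "")
    value = float(m.group(1))
    # The bracketed digits apply to the last quoted decimal places.
    su = int(m.group(3)) * 10.0 ** (-decimals)
    return value, su


def flack_in_range(text: str, k: float = 3.0) -> bool:
    """Test -k*u =< x =< 1 + k*u for a Flack parameter given as 'x(u)'."""
    x, u = parse_numb_su(text)
    return -k * u <= x <= 1.0 + k * u
```

The 'inapplicable' value '.' for centrosymmetric structures is not handled
here; a real validator would have to treat it separately.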

H> (2) In the example [for category REFINE], I have made two suggested changes:
H> 
H> (2a) The line
H>           _refine_ls_abs_structure_Flack     0
H> has been removed. According to the comment in 
H> _refine_ls_abs_structure_details an absolute configuration has been
H> assigned, not obtained using x by least-squares refinement. There is
H> thus no standard uncertainty on x and no information available that the
H> assigned value is confirmed by the diffraction measurements. In fact x
H> has not been used at all. In such cases the use of the 
H> _refine_ls_abs_structure_Flack data name seems completely inappropriate.
H> 
H> (2b) 
H>     _refine_ls_abs_structure_details
H>     ;      The absolute configuration was assigned to agree
H>            with the known chirality at C3 arising from its
H>            precursor l-leucine.
H>     ;
H> See the modified version below. Glossing over the inclusion of the
H> letters 'ls' - 'least squares' in the data name, despite the fact that
H> in this case nothing concerning the absolute structure has been obtained
H> from the diffraction data by least squares, the wording has been changed
H> to remove 'chirality'. In the physics and chemical literature, (Lord
H> Kelvin; Whyte; Cahn, Ingold & Prelog, Prelog), the notion of a chiral
H> object is always clearly defined and in essence the same: [[(VP76) "An
H> object is chiral if it cannot be brought into congruence with its mirror
H> image by translation and rotation.  Such objects are devoid of symmetry
H> elements which include reflection: mirror planes, inversion centers, or
H> improper rotational axes." ]]. On the other hand the above authors do
H> not offer an unequivocal definition of chirality and its exact or
H> implied meaning is obscure. Glazer and Stadnicka even go as far as to use
H> chirality to mean optical activity for which the symmetry restrictions
H> are different from those of a chiral object. So I prefer not to use
H> chirality. On the other hand the term 'absolute configuration', although
H> horrid (like absolute structure), is clear in the chemical literature
H> although not always applied rigorously in publications of the IUCr.
H> 
H> 
H> ############
H> ## REFINE ##
H> ############
H> 
H> data_refine_[]
H>     _name                      '_refine_[]'
H>     _category                    category_overview
H>     _type                        null
H>     loop_ _example
H>           _example_detail
H> # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
H> ;
H>    ...
H>     _refine_ls_extinction_expression
H>     ;  Larson, A. C. (1970). "Crystallographic Computing",
H>        edited by F. R. Ahmed. Eq. (22) p. 292. Copenhagen: Munksgaard.
H>     ;
H>     _refine_ls_abs_structure_details
H>     ;      The absolute configuration was assigned to agree with that of 
H>            its precursor l-leucine at the chiral centre C3.
H>     ;
H>     _refine_ls_number_reflns           1408
H>     _refine_ls_number_parameters       272
H>     _refine_ls_number_restraints       0
H>     _refine_ls_number_constraints      0
H>     ...

I've made the suggested changes to the example.

H> (3) I'm beginning to wonder why the chemical crystallographers and the
H> chemical data base people have not started to cry out for an
H> 'absolute_configuration_determined' CIF data name.

Would you like to propose one, with appropriate definition and enumerations?


D77.3 Management of CIF DDLs
----------------------------
In my response to David's remarks about the "_related_function" entries
above, I suggest a small enhancement to DDL1.4 (use of "replaces" and
"replacedby"). I have one or two other suggestions I would like to make for
the next DDL revision. However, there is no formal relationship between the
responsibilities of the CIF Committee and the development of Dictionary
Definition Languages. Hence, while COMCIFS may argue at great length the
nuances of a CIF definition, it has no control over the underlying formalism
of the dictionary. This seems to some extent unsatisfactory. In part this
has arisen because the original DDL (now at version 1.4) was developed by
Syd and Tony Cook (and subsequently with the help of Nick Spadaccini) to
cover applications outside the CIF arena (DDL1.4 has hooks for nested loops,
for instance). The desire was to have the freedom to develop the DDL across
a wider range of STAR applications.

However, given that we are currently working with two flavours of DDL, the
choice of dictionary formalism relevant to any CIF dictionary is itself
an application issue. Would the DDL development teams consider it
appropriate to maintain DDL dictionaries under the jurisdiction, as it were,
of COMCIFS? That is, the existing DDL1.4 and DDL2.1 dictionaries should be
managed as CIF applications by COMCIFS subcommittees. Extensions to these
dictionaries would need to be approved by COMCIFS. (This would not, however,
prevent development of DDL dictionaries in different directions for other
applications; but such divergent dictionaries would not be used for CIF
purposes.)

My thanks to John Westbrook for raising this point in private discussion.


Regards
Brian