This is an archive copy of the IUCr web site dating from 2008. For current content please visit https://www.iucr.org.

Re: _item.mandatory_code can be undefined! (plus other development

John Westbrook (jwest@ndbdev.Rutgers.EDU)
Fri, 11 Aug 1995 09:33:28 -0400


On Aug 11, 10:52am, Peter Keller wrote:
> Subject: Re: _item.mandatory_code can be undefined! (plus other developmen
> > There is some history to this issue
>
> Yes, I thought that there might be....
>
>                                     which is related to providing compliance
> > to earlier dictionaries and CIFs.   It was agreed that in order to provide
> > an easier integration with older dictionaries that there be a placeholder
> > definition for every item in the mmCIF dictionary.  This really results
> > in a large number of essentially redundant definitions for data items
> > that are children of other items. In these cases only the definition of
> > the data item and perhaps the item name have been specified in the mmcif
> > dictionary. Placing a default value on the mandatory code would result
> > in conflicting definitions for this attribute as in almost all cases these
> > items are part of the key for the category in which they reside.
>
> Yes, I see - I hadn't quite realised the implications of this. I suppose
> that ideally there ought to be some kind of inheritance mechanism.....
>
Yep.. I have a proposal on this issue which follows...
>                                                                  Rather
> > that load up all of the definitions with an additional mandatory code
> > attribute we have chosen to make this specification optional.
>
> OK, so as it stands, there are in effect four possible values of
> _item.mandatory_code:  yes, no, implicit and undefined, with undefined
> being the effective default. I have no fundamental problem with this -
> just that there is no way of knowing, from publicly available information,
> what an application is supposed to do with an item for which
> _item.mandatory_code is undefined. At the very least, this issue should be
> discussed in the documentation, and some sort of convention for dictionary
> developers proposed. This kind of thing makes it very difficult for people
> (like myself) who were not part of the original CIF/DDL effort, to write
> effective and robust applications and libraries. In this particular case,
> if this isn't clarified, you risk running into what the compiler writers
> call 'implementation-dependent behaviour'. It is very dangerous to rely on
> the exercise of common sense by different people who are not in regular
> contact, to produce consistent results! You have been warned!
>
Your point is well taken, please see later comments on this..
>                                                             However,
> > you will note in the mmCIF dictionary that it is provided for all data
> > items except in the case of redundant children.
>
> Actually, if you look, you will see that it is provided for all data items
> _including_ the case of redundant children, as well as in their parent
> item definitions. This is causing me problems, because there is nothing in
> the DDL documentation about how multiply defined properties of dictionary
> items are to be treated, and no mechanism (at least in publicly available
> documentation) which ensures that conflicts don't arise. Ah, well, I
> suppose that I shall just have to use my common sense....:-).
>
It is good that you bring up the point of the "update policy".  This was
in an early version of the DDL and was removed because it had too much
of a "database" flavor.  When I removed this I was of the mind set that
we would avoid such problems at the dictionary level because we would
not typically encounter redundant definitions.  This was in fact something
we set out to avoid.  Since that time the redundancy has crept back into
the dictionary and we are confronted with the problem once again.

This actually relates to other criticisms that you raise regarding the
enforcement of a mandatory policy for data type and mandatory code.
One of the major reasons for making these optional was to avoid the
possibility of an inconsistent update in redundant data definitions.
From my last look at the mmCIF dictionary it appears that most of these
instances now include a complete respecification of at least the
item category (which includes the mandatory code).  I would prefer if
we could simply not include the redefinition of the item category
if this were acceptable to everyone and simply provide an item description
which may in some cases need to be refined relative to the parent
description.

Back to updating... We are treating these situations from the software
perspective as overwrites.  When we encounter a  duplicate key in a
category we assume it updates the row associated with the key.  Consequently,
any respecification of a row must be complete. We further process the
update in the order in which we parse the definitions in the dictionary.
This is problematic as this should really be order independent, but
there should really not be any redundancy either.   We have experimented
with partial row updates, and simply discarding duplicate rows but
there are problems with both of these approaches.  I think that the
full update approach is the most reasonable.  From the standpoint of
checking the dictionary we have been printing diagnostics when a
row update occurs with a data item value conflict which does not involve
a NULL value.
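
A minimal sketch of the full-row overwrite policy described above, in
Python.  It is not the code we actually run; the names apply_row and
NULL_VALUES, the item name _example.id, and the choice of "?" and "." as
the NULL markers are assumptions made for illustration only.

```python
# Values treated as NULL when comparing rows (assumption: None plus the
# CIF "unknown" and "inapplicable" markers).
NULL_VALUES = {None, "?", "."}


def apply_row(table, key, row, diagnostics):
    """Store row under key; a duplicate key overwrites the whole stored row.

    A diagnostic is recorded for every attribute whose old and new values
    differ, unless either of the two values is NULL.
    """
    old = table.get(key)
    if old is not None:
        for attr in sorted(set(old) | set(row)):
            before, after = old.get(attr), row.get(attr)
            if before in NULL_VALUES or after in NULL_VALUES:
                continue
            if before != after:
                diagnostics.append((key, attr, before, after))
    # Full replacement, not a merge: the respecified row must be complete.
    table[key] = dict(row)


# Rows are applied in parse order, so the later definition wins.
items = {}
diags = []
apply_row(items, "_example.id", {"mandatory_code": "yes"}, diags)
apply_row(items, "_example.id", {"mandatory_code": "no"}, diags)
```

After the second call the stored row carries mandatory_code "no" and
diags holds one entry describing the yes/no conflict.  Because the later
row wins, the result depends on parse order, which is the weakness noted
above.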

Now there is a question about where this sort of information/rule should be
encoded.  If there are no objections I will add this into the description
for the category key for the moment.

An issue related to this is the propagation of an update throughout a
collection of categories with parent/child relationships.  For instance, an
update of an _entity.id in the entity category could have profound
consequences down the structural hierarchy.  There are well-known
ways of describing the alternative actions, but this sort of information
has been stripped out of DDL 2.1.   As Peter points out perhaps it
should at least be dealt with in some associated documentation.

>
> The section on the ITEM_LINKED category points out (quite rightly) the
> difficulties which can arise from cyclical linkages, but says nothing
> about conflicts such as:
>
> save__cat1.name
>     ....
>     _item.mandatory_code    yes
>
>     save_
>
> save__cat2.name
>
>     loop_
>     _item_linked.parent_name  _item_linked.child_name
>     '_cat2.name'              '_cat1.name'
>     ...
>
>    loop_
>    _item.name      _item.mandatory_code
>    '_cat2.name'    yes
>    '_cat1.name'    no
>    ...
>
>    save_
>
> Can I rely on SIFLIB (or other tools being used by dictionary developers)
> ensuring that this kind of thing doesn't happen, or do I have to check for
> this in my own applications? I think that those of us who are dictionary
> users (i.e. CIF application developers) as opposed to dictionary
> developers, should be told.
>
In this case we would detect an update and that there was a conflict in
the value of mandatory_code.   Simply relying on an overwriting update
in this case is problematic.

>
> Back to the common sense point. Those of you who were at Montreal, may
> remember that I said that there must be a forum for developers to discuss
> these, and other, issues, and to check their interpretations of the DDL
> and dictionaries. An example - is there someone out there who can clarify
> this point:
>
> _category.mandatory_code for the ITEM_TYPE category is no, i.e. it is not
> compulsory to define the type of a data item in a dictionary. So, if an
> application uses my library to request an item from a CIF file, and
> _item_type.code is undefined for that item in the dictionary, what is my
> library supposed to do? Refuse to process the item, and stop with an
> error? Assume a type of text, and let the application program sort out

As I explained before, we are in a catch 22 situation here.  If we make this
mandatory, then it must be respecified for each redundant item in the
dictionary.  We very much wish to minimize this sort of redundancy as
much as this is possible.   I appreciate the problems that this
causes with respect to data type ..  What if we expand the enumeration
for mandatory_code to include "mandatory/inherit" which formally specifies that
an item is required and can inherit the property of a parent.  If there is no
parent or the value is not specified then there is clearly an error
condition.  This would involve a rather small change to the DDL that
would not require any changes in the mmCIF dictionary.
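
A rough sketch of how an application might resolve such an inherited
value, in Python.  This is only an illustration of the proposal, not an
agreed DDL rule: the spelling "inherit" for the new enumeration value,
the function resolve_mandatory_code, and the child item name
_example_child.entity_id are all hypothetical.  The check for cyclic
linkage is an extra safeguard, not part of the proposal.

```python
INHERIT = "inherit"  # hypothetical spelling of the proposed enumeration value


def resolve_mandatory_code(name, codes, parent_of):
    """Return the effective mandatory code for the item called name.

    codes     maps item name -> declared mandatory code (may be absent)
    parent_of maps child item name -> parent item name
    Raises ValueError when no value can be resolved.
    """
    seen = set()
    current = name
    while True:
        if current in seen:
            raise ValueError("cyclic parent linkage at " + current)
        seen.add(current)
        code = codes.get(current)
        if code is None:
            # Value not specified anywhere along the chain: error condition.
            raise ValueError("no mandatory code specified for " + current)
        if code != INHERIT:
            return code
        if current not in parent_of:
            # Item asks to inherit but has no parent: error condition.
            raise ValueError(current + " inherits but has no parent")
        current = parent_of[current]


codes = {"_entity.id": "yes", "_example_child.entity_id": INHERIT}
parent_of = {"_example_child.entity_id": "_entity.id"}
effective = resolve_mandatory_code("_example_child.entity_id", codes, parent_of)
```

Here the child item resolves to the parent's value, "yes".  An item that
asks to inherit but has no parent, or whose parent carries no value, is
reported as an error rather than silently defaulted.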

Regards..

John

-- 
****************************************************************************
*  John Westbrook                       Ph:  (908) 445-5156                *
*  Department of Chemistry              Fax: (908) 445-5958                *
*  Rutgers University                                                      *
*  PO Box 939                        e-mail: jwest@rutchem.rutgers.edu     *
*  Piscataway, NJ 08855-0939                                               *
****************************************************************************