Dictionary introductory sections; core/extension relationships

  • To: COMCIFS@uk.ac.iucr
  • Subject: Dictionary introductory sections; core/extension relationships
  • From: bm@uk.ac.iucr (Brian McMahon)
  • Date: Mon, 27 Sep 93 14:42:59 BST

Dear Colleagues

Our technical discussions were initiated a couple of weeks ago by Paula's
request for handling the introductory sections to the CIF Dictionary.
Since then, a few of us have discussed aspects of this behind the scenes,
and now seems an appropriate time to summarise what has gone before and
invite your further input.

The first concern was replacement of the terminology *_appendix (first
introduced into the original Core Dictionary as _chemical_formula_appendix)
by *_intro as being a more explanatory phrase. Paula is happy with this,
but David has a further suggestion:

D>    At the meeting in Beijing we decided to replace the term *_appendix by
D> *_intro, but on more mature consideration I think that this term is just
D> as confusing.  There will be a temptation to users to think they can set
D> up a data field *_intro in which they can write a little introduction to
D> the problems of their data stored in this category.  I would like to
D> propose that we replace *_intro (or *_appendix) with
D> *_dictionary_definition or *_dict_def.  This would then be in the category
D> dictionary_definition, and should make it clear that it is not a data field
D> that can be used within a normal cif.  I would welcome comments on this
D> proposal.

If one pays attention to the software dictates of the dictionary, the use to
which the introductory paragraphs may be put is adequately flagged by the
category of the definitions (and "dictionary_definition" will do nicely for
that) as well as the data type, which is proposed to be "null". However, the
opportunity for linguistic confusion that David identifies does exist, and I
have encountered an example of someone who intended to use such an *_intro
section in just such a way, i.e. to introduce his data collection problems!

Hence "_dictionary_definition" has an appeal - but is too long. "_dict_def"
is more opaque, but at 9 characters is the same length as the existing
"_appendix". Note that in the draft macromolecular dictionary two *_appendix
entries are already at the maximum allowed length of 32 characters
(identifying which two is left as an exercise for the interested reader!).
This matters for what follows. Paula's first message suggested:

P> I have a problem with moving all *_intro items for core categories to the
P> core - that leaves us no place to put the mm specific examples in the
P> mm dictionary, and no place to list the related items in each category that
P> are in the core (we need this to be sure that everything is provided for.)
P> 
P> .. possible solution .. have dictionary specific _intro items where needed. 
P> For instance, _atom_site_appendix could become _atom_site_mm_intro and you
P> could move _atom_site_intro into the core. The problem with this is keeping
P> the real definition part ... consistent between the two dictionaries.

This approach is now Paula's preferred method, with the use of the _mm tag
to indicate extensions appropriate to this dictionary. (It also runs into 
potential problems over the 32-character limit.) One could certainly have
_mm_dict_def terms, but the problems with string length are correspondingly
more severe. I have suggested reverting to the use of _appendix for
supplementary information in extension dictionaries, but this would also need
to be tagged with an identifier for the particular extension dictionary in use.
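
The length arithmetic behind these concerns can be checked mechanically. The
following is only a minimal sketch, not part of any CIF software: the function
name check_names and the list of candidate suffixes are hypothetical, and it
assumes the 32-character limit counts the leading underscore (consistent with
counting "_dict_def" as 9 characters above).

```python
# Sketch only: tests candidate suffixes against the 32-character data-name limit.
# Assumption: the limit includes the leading underscore of the data name.
MAX_NAME_LENGTH = 32

# Candidate suffixes discussed in the message (list assembled for illustration).
SUFFIXES = [
    "_appendix",
    "_intro",
    "_dict_def",
    "_dictionary_definition",
    "_mm_intro",
    "_mm_dict_def",
]


def check_names(category, suffixes=SUFFIXES, limit=MAX_NAME_LENGTH):
    """Return {data_name: (length, fits_within_limit)} for each suffix."""
    results = {}
    for suffix in suffixes:
        name = category + suffix
        results[name] = (len(name), len(name) <= limit)
    return results


if __name__ == "__main__":
    # Category names taken from the examples quoted in the message.
    for category in ("_atom_site", "_chemical_formula"):
        for name, (length, fits) in check_names(category).items():
            flag = "ok" if fits else "TOO LONG"
            print(f"{name:<42} {length:>2}  {flag}")
```

Running the same check over every category name in a draft dictionary would
show at once which suffix choices leave no headroom for an extension tag such
as _mm.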

May I therefore poll your views on the nomenclature of these sections, and on
any problems you can foresee with this scheme of a core description followed
by appendices (or whatever!) that modify or extend it for the extension
dictionaries?

May I also ask you to reflect on another of the points Paula has raised:

P> My mail of last week to the whole committee raised the issue of the
P> relationship of an extension dictionary to the core.  The particular
P> issue was the *_intro items, but the issue is more general.  

There are various aspects to this, some of which we shall need to explore.
Let me ask first a specific question: should the long-term goal be a single
CIF dictionary that includes all the extension items now under development,
or should the Core and whatever ancillary dictionaries are necessary be
loaded into applications software as separate files? (I already know how
some of you will respond to this!)

Sorry if this has been excessively verbose - I've tried to develop the thread
of the argument for those who are relatively new to it, but do let me know if
you prefer a more succinct secretary!

Regards
Brian