RE: Items for the Agenda of the COMCIFS closed meeting

  • To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
  • Subject: RE: Items for the Agenda of the COMCIFS closed meeting
  • From: "Dr. John Faber" <faber@ICDD.com>
  • Date: Mon, 21 Mar 2005 16:31:12 -0500
Thanks very much for these agenda items.  We do have ongoing discussion
at ICDD on the use of CIF in our business practices for databases.  I
will be prepared to discuss some of these issues.

John Faber
Phone: 610-325-9814 extension: 20

-----Original Message-----
From: comcifs-bounces@iucr.org [mailto:comcifs-bounces@iucr.org] On
Behalf Of David Brown
Sent: Monday, March 21, 2005 3:17 PM
To: Discussion list of the IUCr Committee for the Maintenance of the CIF
Standard (COMCIFS)
Subject: Items for the Agenda of the COMCIFS closed meeting

To members of COMCIFS


I would like to place the following two topics on the agenda for the
closed meetings in Florence.  I welcome suggestions for other agenda
items.

1. What is the role of CIF in the current rapidly changing world of 
information technology?

2. How can we make transparent the boundary between CIFs written with 
DDL1 dictionaries and those written with DDL2?

David Brown



It should be no surprise that an information technology language adopted
in 1990 needs to be reviewed after fifteen years of operation.   The 
rapid advances in the field and the introduction of XML make such a 
review more than timely.  A further urgency is added by the need to 
ensure that incremental changes that we make in the dictionaries and 
other documents are compatible with future directions of 
crystallographic information technology.  Two current problems 
illustrate how this impacts on dictionary structures.

1. Is it better to have a semantically meaningless item as the 
_list_reference (DDL1) or _category_key (DDL2) to label each line in a 
loop, or should we use semantically meaningful items (such as 
_atom_site_label) that are already present?  The former solution allows
more straightforward programming and avoids possible conflicts between
the information technology and crystallographic uses of the item, but the
latter leaves the CIF less cluttered and easier for humans to read,
because the links are more readily followed by eye.  The current 
revision of the core dictionary needs an answer to this question, 
because the answer will affect future CIF data structures.

2. Should there be rules defining the relationships that are allowed to 
be expressed by parent-child links?  These links have been developed in 
an ad hoc way, but as we move towards more advanced data structures, we 
may find that we have developed links that are impossible to 
manipulate.  One way of exploring the logic of the linked structures is
to use the Resource Description Framework (RDF), which is being developed
as part of the Semantic Web (see http://www.w3.org/RDF/ and
http://www.w3.org/RDF/FAQ ).  This scheme expresses the parent-child
links as a graph, making it easier to trace the logic.  Another
possibility is to use the Unified Modeling Language (www.uml.org).
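
As a rough illustration of treating parent-child links as a graph, the
following Python sketch stores each link as a child-to-parent edge and
walks the chain of parents, reporting a loop if one is found.  It is not
RDF or UML, only a minimal stand-in for the idea; the link table and the
function name ancestors are assumptions made for this example and are
not copied from any dictionary.

```python
# Hypothetical link table for illustration: child data name -> parent data name.
LINKS = {
    "_geom_bond_atom_site_label_1": "_atom_site_label",
    "_geom_bond_atom_site_label_2": "_atom_site_label",
    "_atom_site_aniso_label": "_atom_site_label",
}


def ancestors(name, links):
    """Return the chain of parents above `name`, nearest parent first.

    Raises ValueError if following the links leads back to a name that
    was already visited, i.e. the links cannot be resolved consistently.
    """
    chain = []
    seen = {name}
    while name in links:
        name = links[name]
        if name in seen:
            raise ValueError(f"cyclic parent-child link at {name}")
        seen.add(name)
        chain.append(name)
    return chain
```

Calling ancestors("_atom_site_aniso_label", LINKS) walks one step up to
_atom_site_label; a table in which two items name each other as parent
is rejected rather than followed forever.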

As interest focuses on software that explores the interactions of small 
and large molecules, the incompatibility between the Dictionary 
Definition Language 1 (DDL1) and DDL2 is becoming a hindrance.

CoreCIF is designed for use with small molecules and is written in DDL1,
but mmCIF, designed for reporting macromolecules, is written using DDL2.
While most of the features of the two standards are similar, there are
two significant differences.  Firstly, DDL2 has a tighter structure
designed to make automatic computer manipulation of the information
easier; secondly, the names given to the data items have a different
structure.  As the similarities between the two languages are far 
greater than their differences, it should be possible to achieve some 
convergence;  already the core dictionary is evolving towards the DDL2 
standard, but a complete convergence would require major reworking of 
some dictionaries.

Convergence can be achieved in different ways.  One way is to ensure
that software is able to validate CIFs against both DDL1 and DDL2
dictionaries.  The dictionaries contain synonyms of the data names
(alternative data names for items with essentially the same definition,
listed under _related_item (DDL1) and _item_aliases.alias_name (DDL2)),
so any character string used to represent a particular data name should
be recognized by software that takes note of any alias names present,
regardless of the dictionary or version being used.  Since all the items
in the coreCIF dictionary appear (transformed to DDL2) in the mmCIF
dictionary with their original DDL1 data names given as aliases, mmCIF
software should be able to read coreCIFs without difficulty.  mmCIF
aliases are currently not present in the coreCIF dictionary but could
easily be added.  Alternatively, a DDL2 version of the coreCIF dictionary
could be separated out and used as an alternative to the DDL1 core
dictionary.
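
The alias approach can be sketched in a few lines of Python.  This is
only an illustration under stated assumptions: the alias table is written
by hand here, whereas real software would build it from the alias entries
in the dictionaries, and the function names canonical and normalise are
invented for this example.

```python
# Hypothetical alias table: every known spelling maps to one canonical name.
# Real software would fill this from the dictionaries' alias entries.
ALIASES = {
    "_cell_length_a": "_cell.length_a",  # DDL1-style name
    "_cell.length_a": "_cell.length_a",  # DDL2-style name
}


def canonical(name, aliases=ALIASES):
    """Map a data name to its canonical form; unknown names pass through.

    Names are lower-cased first because CIF data names are
    case-insensitive.
    """
    key = name.lower()
    return aliases.get(key, key)


def normalise(items, aliases=ALIASES):
    """Return a copy of `items` with every data name made canonical."""
    return {canonical(name, aliases): value for name, value in items.items()}
```

With a table of this kind, a reader that receives _cell_length_a from a
coreCIF and _cell.length_a from an mmCIF treats the two as the same item.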
