This is an archive copy of the IUCr web site dating from 2008. For current content please visit https://www.iucr.org.

ITEM_TYPE_LIST (was RE: TER and MODEL)

Peter Keller (bsspak@bath.ac.uk)
Wed, 27 Sep 1995 09:56:18 +0100 (BST)


> >is that there are no coordinates to go with the TER.  To be consistent
> >with the rest of the examples in cifdic.m95, we should fill the
> >missing coordinate columns with a period, but that means we have to
> >ensure that "." is a valid "float" _item_type.code.  Fortunately,
> >"float" does not yet seem to be formally defined in dd1 2.1.0, so I
> >would suggest choosing a syntax for float which treats "." as an
> >acceptable "float".  Then TER would be OK.

The contents of the ITEM_TYPE_LIST category in mmCIF are still a major 'low
level' problem (by that, I mean something which is independent of any
considerations of proteins, crystallography, etc). In its current form, the
category makes robust lexical analysis of macromolecular CIFs impossible. I
think (hope) that Paula, John and the others are including this in the
issues for discussion round about now.

The fact that float is not defined in the DDL only means that the DDL
items themselves don't have a special float type. Float _is_ defined in
the dictionary itself, and hence is defined for items used in CIFs. The
definition is around line 27100, and is:

               float     numb
              '-?(([0-9]+)|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?)'
;              int item types are the subset of numbers that are the floating
               numbers.
;

[ By the way, this is another typo: that should read:

;              float item types.......

]

As I'm sure you can see, this construction doesn't allow '.'. Neither does
it allow '?'. My working method so far has been to take it as read that
these two characters are 'universals', and should be checked for before
any attempt to interpret _item_type.code for a particular item.
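
To make that working method concrete, here is a rough Python sketch. It is
only an illustration, not anything taken from the dictionary or from an
existing parser: the names FLOAT_PATTERN, UNIVERSALS and is_valid_value are
made up for the example, and the pattern is simply the float construct
quoted above, applied to the whole value.

```python
import re

# Pattern copied from the 'float' entry quoted from the dictionary.
FLOAT_PATTERN = re.compile(r'-?(([0-9]+)|([0-9]*\.[0-9]+)([eE][-+]?[0-9]+)?)')

# Assumption for this sketch: '.' and '?' are treated as 'universals'
# that are acceptable for any item, whatever its _item_type.code.
UNIVERSALS = {'.', '?'}


def is_valid_value(value, pattern=FLOAT_PATTERN):
    """Return True if value is a universal or matches the type's pattern."""
    # Check for the universals first, before looking at the type at all.
    if value in UNIVERSALS:
        return True
    # Otherwise the whole value must match the construct for the item type.
    return pattern.fullmatch(value) is not None
```

With this arrangement a bare '.' in a coordinate column passes the check
even though the float construct on its own rejects it, which is the
behaviour the TER example would need.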

There are other potential problems with this, but I'll put them on hold
for the time being, because I know that COMCIFS are talking about a whole
range of issues around now. We'll see what comes out of these discussions. 

[snip]

> 	the PDB format.  Quite a number of things in the PDB format
> 	are done the wrong way. 

I agree with Dale on this general point. It shouldn't be necessary to 
perpetuate PDB-format kludges, no matter how ingenious.

Cheers,
Peter.