Discussion List Archives


Re: Advice on COMCIFS policy regarding compatibility of CIF syntax with other domains



On Wed, Mar 9, 2011 at 11:29 PM, Doug <doug.duboulay@gmail.com> wrote:

One CIF feature that no other software language supports natively
are measurement numbers, with their SUs. Maybe they should be encoded as
tuples for wider compatibility?

A value such as 1.234(45) puts a significant, but not impossible, additional burden on the programmer. The string is not parsable by mainstream software. There are two possibilities:
* throw the su away. I suspect most people do this. I have certainly done it in some cases.
* build a data structure that holds it. Our CIFXML and CML do this (and another opportunity to thank CIF for influencing CML). CIFXML has a "su" attribute that holds the su value. The software has to work out what the su value is in absolute (not relative) terms - i.e. for the value above the su is 0.045.
CMLScalar has a range of tools for managing numeric annotation (min, max, error and errorBasis). It can carry this through a transformation process that does not change the values (i.e. output what it read in). The CML software (JUMBO) does not generally do error propagation, though the data structure is set up to do it.
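
To make the "absolute terms" step concrete, here is a minimal sketch in Python of how the su might be scaled to the last digits of the value. The function name parse_cif_number and the regular expression are my own illustration, not part of CIFXML, CML or JUMBO, and the sketch assumes a plain decimal number with no exponent part.

```python
import re
from decimal import Decimal

# Hypothetical pattern: an optionally signed decimal number followed by
# an su in parentheses, e.g. "1.234(45)". Exponent notation is NOT handled.
_SU_PATTERN = re.compile(r"^([+-]?(?:\d+\.?\d*|\.\d+))\((\d+)\)$")


def parse_cif_number(text):
    """Return (value, su) as Decimals; su is None when no su is given."""
    text = text.strip()
    match = _SU_PATTERN.match(text)
    if match is None:
        # No parenthesised su: treat the whole string as a plain number.
        return Decimal(text), None
    value = Decimal(match.group(1))
    # The su digits apply to the last digits of the value, so shift them
    # by the exponent of the value's least significant digit.
    exponent = value.as_tuple().exponent
    su = Decimal(int(match.group(2))).scaleb(exponent)
    return value, su


value, su = parse_cif_number("1.234(45)")
print(value, su)
```

Decimal is used rather than float so that the number of digits written in the file is preserved, which is what the su scaling depends on.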

It's an illustration of how syntactic constructs, each apparently simple, build up to a level where the whole becomes unimplementable. A typical example is the use of \' to add accents. If this is solely typographic then this is manageable. But suppose I have "\'Ecole". Do I hold the \'E as
* 1-character (and have to struggle to find the Unicode code point)
* 2-character (combining acute accent code point and E)
* 3-character (backslash, ' and E)
and is it different in France and Canada?
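
The three options can be compared side by side in a short Python sketch. The helper decode_acute is hypothetical and handles only the \' (acute) escape; the choice between NFC and NFD is exactly the decision the CIF spec leaves open, so neither form here should be read as the prescribed one.

```python
import re
import unicodedata


def decode_acute(text, form="NFC"):
    """Hypothetical helper: turn \\'X into X plus an acute accent.

    form="NFC" gives the precomposed single code point where one exists;
    form="NFD" gives the base letter followed by U+0301 (combining acute).
    """
    # In Unicode the combining mark follows the base letter.
    converted = re.sub(r"\\'(.)", lambda m: m.group(1) + "\u0301", text)
    return unicodedata.normalize(form, converted)


raw = "\\'Ecole"                    # as written in the CIF: backslash, ', E, ...
composed = decode_acute(raw, "NFC")    # starts with U+00C9
decomposed = decode_acute(raw, "NFD")  # starts with E, U+0301

for label, s in (("raw", raw), ("NFC", composed), ("NFD", decomposed)):
    print(label, len(s), [hex(ord(c)) for c in s])
```

The three strings have different lengths and compare unequal to each other unless they are normalized to the same form first, which is where incompatible implementations would diverge.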

The CIF spec does not say how to implement the construct, and almost certainly different people will use different, incompatible approaches. And I gave up trying Hungarian diacritics. There is a limit for everyone :-)

P.

--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069
