
Re: Advice on COMCIFS policy regarding compatibility of CIF syntax with other domains

To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <[email protected]>
Subject: Re: Advice on COMCIFS policy regarding compatibility of CIF syntax with other domains
From: Peter Murray-Rust <[email protected]>
Date: Thu, 10 Mar 2011 11:24:17 +0000
Cc: Doug <[email protected]>
In-Reply-To: <[email protected]>
References: <[email protected]><[email protected]>

On Wed, Mar 9, 2011 at 11:29 PM, Doug <[email protected]> wrote:

One CIF feature that no other software language supports natively
are measurement numbers, with their SUs. Maybe they should be encoded as
tuples for wider compatibility?

1.234(45) puts a significant, but not impossible, additional burden on the programmer. The string is not parsable by mainstream software. There are two possibilities:
* throw the su away. I suspect most people do this. I have certainly done it in some cases
* build a data structure that holds it. Our CIFXML and CML do this (and this is another opportunity to thank CIF for influencing CML). CIFXML has a "su" attribute that holds the su value. The software has to work out what the su value is in absolute (not relative) terms, i.e. for the value above the su is 0.045.
CMLScalar has a range of tools for managing numeric annotation (min, max, error and errorBasis). It can carry this through a transformation process that does not change the values (i.e. output what it read in). The CML software (JUMBO) does not generally do error propagation, though the data structure is set up to do it.
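
As an illustration of the second possibility, here is a minimal Python sketch that splits a string such as 1.234(45) into a value and an absolute su. The function name parse_su and the regular expression are hypothetical, not part of CIFXML, CML or JUMBO, and the sketch assumes a plain decimal number without an exponent.

```python
import re
from decimal import Decimal

# Hypothetical pattern: a plain decimal number, optionally followed by "(digits)".
# Exponent forms are deliberately not handled in this sketch.
_NUMBER_WITH_SU = re.compile(r"^([+-]?\d+(?:\.(\d*))?)(?:\((\d+)\))?$")


def parse_su(text):
    """Return (value, su) as Decimals; su is None when no su is given."""
    match = _NUMBER_WITH_SU.match(text.strip())
    if match is None:
        raise ValueError(f"not a recognised number: {text!r}")
    mantissa, fraction, su_digits = match.groups()
    value = Decimal(mantissa)
    if su_digits is None:
        return value, None
    # The su digits apply to the last quoted decimal places of the value,
    # so shift them right by the number of fractional digits.
    places = len(fraction) if fraction else 0
    su = Decimal(su_digits).scaleb(-places)
    return value, su


value, su = parse_su("1.234(45)")
print(value, su)
```

Decimal is used rather than float so that the number of quoted digits, which determines the scale of the su, is not lost in a binary conversion.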

It's an illustration of how each particular syntactic construct, though apparently simple, builds up to a level where it becomes unimplementable. A typical example is the use of \' to add accents. If this is solely typographic then this is manageable. But suppose I have "\'Ecole". Do I hold the \'E as
* 1-character (and have to struggle to find the Unicode code point)
* 2-character (E and a combining acute accent codepoint)
* 3-character (backslash, ' and E)
and is it different in France and Canada?
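
The three options can be made concrete with Python's standard unicodedata module. This is only a sketch: the helper decode_acute is hypothetical, it handles only the \' (acute) code, and it assumes the accent maps to the Unicode combining acute accent U+0301, which the CIF spec does not mandate.

```python
import re
import unicodedata

COMBINING_ACUTE = "\u0301"  # assumed target for the CIF \' code


def decode_acute(text):
    """Hypothetical decoder: turn each \\'X into X followed by U+0301."""
    return re.sub(r"\\'(.)", lambda m: m.group(1) + COMBINING_ACUTE, text)


raw = "\\'Ecole"                                      # backslash, ', E held as three characters
decomposed = decode_acute(raw)                        # E plus combining accent: two code points
composed = unicodedata.normalize("NFC", decomposed)   # single precomposed code point

for label, s in (("raw", raw), ("decomposed", decomposed), ("composed", composed)):
    print(label, len(s), [hex(ord(c)) for c in s])
```

The three strings would normally render the same way once decoded but do not compare equal, so two programs that pick different forms will disagree on string equality, length and sorting unless they normalise first.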

The CIF spec does not say how to implement the construct, and almost certainly different people will use different, incompatible approaches. And I gave up trying Hungarian diacritics. There is a limit for everyone :-)

P.

--
Peter Murray-Rust
Reader in Molecular Informatics
Unilever Centre, Dep. Of Chemistry
University of Cambridge
CB2 1EW, UK
+44-1223-763069
