Re: [ddlm-group] Semantics of whitespace-delimited values
- To: James Hester <jamesrhester@gmail.com>, Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Semantics of whitespace-delimited values
- From: SIMON WESTRIP <simonwestrip@btinternet.com>
- Date: Wed, 8 Jul 2015 14:01:08 +0000 (UTC)
- In-Reply-To: <CAM+dB2c00ZpU=N8=A1f0AafxQmOX6Rtr9tdijLFA=Dw2XOzLuA@mail.gmail.com>
- References: <CAM+dB2c00ZpU=N8=A1f0AafxQmOX6Rtr9tdijLFA=Dw2XOzLuA@mail.gmail.com>
Unfortunately, but justifiably, ORTEP3 for Windows is very unforgiving if numbers are delimited: it just reports that it can't find the value. It uses CIFtbx to parse a CIF.
So this is at least one example of a (very) popular program that is strict with regard to the format of numbers for the data items it is interested in.
There may well be other applications that adhere strictly to the convention of presenting numbers without delimiters; after all, it's unlikely that there are many CIFs that contain core data in any other format, and in reality I don't expect this to change for CIF2.
So at this stage I'm still inclined to adopt the approach described by James for CIF2...
Cheers
Simon
From: James Hester <jamesrhester@gmail.com>
To: SIMON WESTRIP <simonwestrip@btinternet.com>; Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Wednesday, 8 July 2015, 10:44
Subject: Re: [ddlm-group] Semantics of whitespace-delimited values
Simon's quick survey is very useful.
I will confess that my PyCIFRW library completely ignores the delimiters and delivers every data value as a string. The calling application is responsible for converting values to numbers if so required (but I do provide routines to do this). If a dictionary is explicitly linked to a data block, PyCIFRW will attempt to return numeric values for dataitems that are specified in the dictionary as being of numeric type, regardless of the delimiters that were originally used. Python software that uses PyCIFRW (at least PyMol and a few others) will therefore behave in this way.
On output PyCIFRW does not delimit numeric values.
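As a rough illustration of the behaviour described above (every value arrives as a string, and numeric conversion is driven by the dictionary rather than by the delimiters), here is a minimal Python sketch. It does not use or reproduce the PyCIFRW API: the names `NUMERIC_ITEMS` and `typed_value` are hypothetical, and the set of numeric items stands in for type information that a real dictionary would supply.

```python
import re

# Hypothetical stand-in for dictionary type information; a real
# dictionary would say which data names are of numeric type.
NUMERIC_ITEMS = {"_cell_length_a", "_atom_site_fract_x"}

# A number with an optional appended esd in parentheses, e.g. 5.4321(7).
_NUMBER_WITH_ESD = re.compile(
    r"([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)(?:\((\d+)\))?"
)

def typed_value(name, raw):
    """Return a float for dictionary-numeric items, else the raw string.

    `raw` is the value text with any delimiters already stripped, so the
    delimiters used in the file play no part in the decision.
    """
    if name.lower() not in NUMERIC_ITEMS:
        return raw
    if raw in ("?", "."):
        # Unknown / not-applicable markers are left untouched in this sketch.
        return raw
    match = _NUMBER_WITH_ESD.fullmatch(raw.strip())
    if match is None:
        raise ValueError(f"{name}: {raw!r} is not a valid number")
    return float(match.group(1))  # the esd is discarded here

print(typed_value("_cell_length_a", "5.4321(7)"))
print(typed_value("_chemical_name_common", "42"))
```

The second call returns the string unchanged, because the (assumed) dictionary does not list that item as numeric.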
On 8 July 2015 at 03:09, SIMON WESTRIP <simonwestrip@btinternet.com> wrote:
OLEX2, JANA, OpenBabel and Avogadro also seem not to care that the numbers are delimited by apostrophes, while enCIFer correctly warns that the values are not correctly formatted.
From: SIMON WESTRIP <simonwestrip@btinternet.com>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Tuesday, 7 July 2015, 17:07
Subject: Re: [ddlm-group] Semantics of whitespace-delimited values
A quick test of some programs I have readily available with an 'invalid' CIF1.1 cif that contains delimited site coordinates:
checkCIF (powered by PLATON) - issues alerts but nevertheless processes the CIF using the delimited values as numbers
publCIF - warns that they should not have delimiters but reads the value as a number anyway (according to the dictionary)
Jmol - renders models as expected.
I'll test a few others in due course, but am pleased to see that these programs would not be scuppered by reading 'delimited numbers'. (NB obviously checkCIF/publCIF could fairly easily drop the alerts for CIF2, which are annoying in any case)
Cheers
Simon
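The two reader policies that come up in this thread, "warn but still read the number" and "refuse a delimited number", can be sketched in a few lines of Python. This is only an illustration of the policies, not the code of any of the programs named; the function `read_coordinate` and its `strict` flag are hypothetical.

```python
import warnings

def read_coordinate(text, delimited, strict=False):
    """Convert a site coordinate to float.

    `delimited` records whether the value was enclosed in delimiters in
    the file. A lenient reader warns and carries on; a strict reader
    rejects the value outright.
    """
    if delimited:
        if strict:
            raise ValueError("numeric value must not be delimited")
        warnings.warn("numeric value should not be delimited", stacklevel=2)
    return float(text)

# Lenient: a warning is issued, but the number is still returned.
print(read_coordinate("0.1234", delimited=True))
```

Calling the same function with `strict=True` raises `ValueError` instead of returning a number.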
From: James Hester <jamesrhester@gmail.com>
To: ddlm-group <ddlm-group@iucr.org>
Sent: Tuesday, 7 July 2015, 15:17
Subject: [ddlm-group] Semantics of whitespace-delimited values
Dear All,
One issue that has not been discussed in the context of the CIF2 syntax is the special interpretation of whitespace-delimited values. In CIF1.1 as recorded in Volume G, a whitespace-delimited question mark and a whitespace-delimited period have a special interpretation as "unknown" and "default/not applicable/null" respectively. Furthermore, only a whitespace-delimited value matching a specified syntax (which includes optional appended esd values) may be interpreted as a numeric value, and it would strictly speaking be a semantic error for a CIF processor to interpret as a number a numeric value enclosed in delimiters.
I have no issue with question mark or period, as these are necessary for semantic completeness.
What I would like to discuss for CIF2.0 is the following:
(i) The interpretation of a data value as numeric is determined solely by the dictionary with no regard to the particular delimiters used in the CIF file;
(ii) A convention is encouraged for CIF writers whereby numeric values are not enclosed by delimiters.
(iii) The precise construction of numeric values is moved into the DDLm attribute dictionary.
The advantage of this simpler scheme is a clean separation between syntax and human-relevant semantics. The only CIF applications that can have a use for the CIF1 scheme are those that are written without reference to a dictionary, most obviously pretty-printers that might want to tabulate numbers by lining up decimal points instead of left-justifying. Even if such formatting applications get it wrong, they will not change the meaning of the file and so I would view point (ii) as sufficient support for such applications. Conversely, any application that wishes to operate on a number as opposed to operating on the textual representation of the number will of necessity need to know what this number means and will therefore be written with reference to a dictionary, making it unnecessary to signal "numericness" using whitespace-delimited data values.
What do others think? If there is a body of CIF1 applications out there that have been designed to raise errors when values expected to be numeric are enclosed by delimiters, this proposal would represent a further annoying change from CIF1, and it would be good to have some idea of how many such applications there are. I speculate that many applications ignore the delimiter status, for reasons of laziness, the authority of the dictionary definitions, and the philosophy of writing liberal parsers.
all the best,
James.
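To make the difference between the CIF1.1 rule and point (i) concrete, the following Python sketch compares the two decisions for the same value. It is an illustration under stated assumptions, not a reference implementation: `RawValue`, `DICTIONARY_TYPES` and both functions are hypothetical names, the regular expression only approximates the numeric syntax, and the dictionary excerpt is invented for the example.

```python
import re
from dataclasses import dataclass

@dataclass
class RawValue:
    text: str        # value text with any delimiters removed
    delimited: bool  # True if the value was enclosed in delimiters

# Approximate numeric syntax with an optional appended esd, e.g. 0.2500(3).
NUMBER = re.compile(r"[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?(?:\(\d+\))?")

# Hypothetical dictionary excerpt mapping data names to types.
DICTIONARY_TYPES = {
    "_atom_site_fract_x": "numeric",
    "_atom_site_label": "text",
}

def numeric_under_cif11(value):
    """CIF1.1 rule: only a whitespace-delimited value of numeric form counts."""
    return (not value.delimited) and NUMBER.fullmatch(value.text) is not None

def numeric_under_proposal(name, value):
    """Point (i): the dictionary alone decides, whatever delimiters were used."""
    if not value.delimited and value.text in ("?", "."):
        return False  # unknown / not applicable keep their special meaning
    return DICTIONARY_TYPES.get(name) == "numeric"

quoted = RawValue("0.2500(3)", delimited=True)
print(numeric_under_cif11(quoted))
print(numeric_under_proposal("_atom_site_fract_x", quoted))
```

Under the proposal, whether the text is a well-formed number becomes a separate validation step against the dictionary's definition of numeric values, as in point (iii).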
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/ddlm-group