Draft JSON specification, round 2
- Subject: Draft JSON specification, round 2
- From: James Hester <jamesrhester@xxxxxxxxx>
- Date: Wed, 19 Apr 2017 16:32:01 +1000
Dear CIF developers,
Reviewing last week's discussion, there is a clear bifurcation in the approaches to CIF-JSON that have arisen in practice: (1) the 'high fidelity' approach of COD-JSON and (2) the 'low overhead' approach of JMol and Marcin. This suggests that a single JSON format is unlikely to satisfy all users. Given that COD-JSON is available, implemented and complete, with open-source tools available, I propose we continue to explore here the 'low overhead' approach to see whether it can be brought to a similar state.
First let me summarise the points where I see consensus arising out of the discussions last week:
(1) No allowance needs to be made for expressing CIF numbers as JSON numbers, and therefore no "uncertainties" object is necessary
(2) To round-trip a CIF, information about which datavalues were quoted must be preserved
(3) Using an escape mechanism for CIF '?' is undesirable; instead \uFFFF or \u0001 would be suitable
So: I propose changing the draft so that (1) all datavalues are strings, (2) an unquoted question mark is replaced by \uFFFF, and (3) the 'uncertainties' object is removed. The resulting JSON would have the following properties:
(1) It would not be possible to (re)create conformant input CIFs unless dictionary definitions are available for all datanames
(2) CIF-JSON readers must parse numeric values as needed
(3) CIF-JSON writers must explicitly format newly-inserted numeric values as JSON strings
(4) A CIF containing numeric datavalues in delimited strings would be processed through a JSON application without detection of non-conformity. For example, JSON generated from the fragment
loop_
_atom_site.fract_x
_atom_site.fract_y
_atom_site.fract_z
"0.0" "0.5" "0.1234(4)"
"0.0" "0.0" "0.7500"
would be processed without error by a JSON application that plots atomic positions (e.g. JMol).
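To make the proposal concrete, here is a minimal Python sketch of a writer that follows the three proposed changes. The JSON layout (one dataname mapped to a list of strings), the function name loop_to_json and the (text, was_delimited) input pairs are assumptions made for illustration only; they are not taken from the draft.

```python
import json

# Placeholder proposed above for an unquoted CIF '?'.
UNKNOWN = "\uffff"


def loop_to_json(datanames, rows):
    """Return a dict mapping each dataname to its column of string values.

    Each cell in `rows` is a (text, was_delimited) pair. The delimiter flag
    is used only to tell an unquoted '?' from a quoted one; it is not stored
    in the output, which is why point (4) goes undetected downstream.
    """
    columns = {name: [] for name in datanames}
    for row in rows:
        for name, (text, was_delimited) in zip(datanames, row):
            if text == "?" and not was_delimited:
                text = UNKNOWN
            columns[name].append(text)
    return columns


names = ["_atom_site.fract_x", "_atom_site.fract_y", "_atom_site.fract_z"]
rows = [
    [("0.0", True), ("0.5", True), ("0.1234(4)", True)],
    [("0.0", True), ("0.0", True), ("0.7500", True)],
]
block = loop_to_json(names, rows)
print(json.dumps(block, indent=2))
```

Every value comes out as a JSON string, so the output is the same whether or not the values were delimited in the CIF.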
Point (1) should be spelled out in all documentation, with COD-JSON suggested as an alternative.
Point (2) imposes extra work compared to parsing the values once before passing around the resulting object, although, as Bob points out, the difference between the built-in JSON parser parsing a non-delimited string and your custom parser parsing a number with optional uncertainty is slight.
Point (3) is very little extra work, as formatting of numbers with uncertainties is not a typical JSON library operation.
Point (4) is not a problem for any software that manipulates specific datanames, as it must know what datatype to expect and thus will parse numbers as needed. Any generic software that does not rely on the meaning of specific datanames (e.g. a pretty printer) may have issues, although a suitable example doesn't come to mind.
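As a rough illustration of how much work points (2) and (3) involve, the sketch below parses a string such as "0.1234(4)" into a value and a standard uncertainty, and formats such a pair back into a string. The function names are hypothetical, and the regular expression is a simplification that ignores exponent notation, so it should not be read as the CIF number grammar.

```python
import re
from decimal import Decimal

# Simplified pattern (an assumption): optional sign, plain decimal digits,
# optional "(su)" suffix. Exponents are deliberately not handled.
_NUMBER = re.compile(r"^([+-]?(?:\d+\.?\d*|\.\d+))(?:\((\d+)\))?$")


def parse_cif_number(text):
    """Return (value, su) as Decimals; su is None when no uncertainty is given."""
    m = _NUMBER.match(text)
    if m is None:
        raise ValueError(f"not a number in this simplified syntax: {text!r}")
    mantissa, su_digits = m.groups()
    value = Decimal(mantissa)
    if su_digits is None:
        return value, None
    # The su refers to the last quoted digits, so scale it by the number of
    # decimal places in the value (exponent is -4 for 0.1234).
    su = Decimal(int(su_digits)).scaleb(value.as_tuple().exponent)
    return value, su


def format_cif_number(value, su=None):
    """Format a Decimal value and optional su as a string for CIF-JSON.

    Assumes the su has no digits left of its last decimal place being
    significant, i.e. a non-positive Decimal exponent.
    """
    if su is None:
        return str(value)
    exponent = su.as_tuple().exponent
    quantized = value.quantize(Decimal(1).scaleb(exponent))
    su_digits = int(su.scaleb(-exponent))
    return f"{quantized}({su_digits})"


value, su = parse_cif_number("0.1234(4)")
text = format_cif_number(value, su)
```

Decimal is used here so that trailing zeros (as in "0.7500") survive a parse and format cycle; a reader that only needs to plot positions could just as well convert the mantissa with float().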
If the consensus is that we would actually like to keep delimiter information, we can add a new object to the CIF JSON: "might-be-a-number", whose value is a list of datanames whose values were not delimited in the CIF file *and* match the regexp for a number. If a loop column contains a mixture, that column is not treated as a number.
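A possible way to compute such a list is sketched below in Python. The regular expression is only a stand-in for "the regexp for a number", and the input structure (dataname mapped to a list of (text, was_delimited) pairs) and the function name are assumptions made for illustration.

```python
import re

# Stand-in for "the regexp for a number" (an assumption): optional sign,
# decimal digits, optional exponent, optional "(su)" suffix.
NUMBER_RE = re.compile(
    r"^[+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?(?:\(\d+\))?$"
)


def might_be_a_number(block):
    """Return the sorted datanames whose values all look like undelimited numbers.

    `block` maps each dataname to a list of (text, was_delimited) pairs.
    A column with any delimited or non-numeric value is left out entirely.
    """
    return sorted(
        name
        for name, values in block.items()
        if values
        and all(
            not delimited and NUMBER_RE.match(text) is not None
            for text, delimited in values
        )
    )


example = {
    "_cell.length_a": [("5.431(2)", False)],
    "_atom_site.label": [("Si1", False), ("Si2", False)],
    "_atom_site.fract_x": [("0.0", True), ("0.5", True)],
    "_atom_site.occupancy": [("1.0", False), ("0.5", True)],
}
numeric_names = might_be_a_number(example)
```

In this example only _cell.length_a is listed: the fract_x column is fully delimited, the label column is not numeric, and the occupancy column is a mixture.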
Thoughts?
James.
--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________ cif-developers mailing list cif-developers@iucr.org http://mailman.iucr.org/cgi-bin/mailman/listinfo/cif-developers