Re: CIF-JSON draft 2017-05-08
- Subject: Re: CIF-JSON draft 2017-05-08
- From: James Hester <jamesrhester@xxxxxxxxx>
- Date: Thu, 11 May 2017 17:28:19 +1000
The problem with requiring that CIF numbers = JSON numbers is that a CIF->JSON parser cannot, in general, know whether a CIF value is a number or simply a non-delimited string that looks like a number. The only way to get this right is to have access to the CIF dictionaries that define all the datanames appearing in the datablock, which is both a considerable overhead (especially for the 7500 definitions in pdbx/mmCIF) and not foolproof, because of local datanames.
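To make the ambiguity concrete, here is a rough Python sketch of a converter that emits a JSON number only when a declared type says so. Everything named here is an assumption for illustration: `ASSUMED_TYPES` is a hand-written stand-in for a real dictionary lookup, the type labels "numb" and "char" and the dataname `_local_batch` are hypothetical, and the pattern is only an approximation of "looks like a number".

```python
import re

# Approximate "looks like a number" test for an undelimited CIF value
# (sign, digits with optional decimal point, optional exponent).
LOOKS_NUMERIC = re.compile(r"[+-]?([0-9]+\.?[0-9]*|\.[0-9]+)([eE][+-]?[0-9]+)?")

# Hypothetical stand-in for the result of loading CIF dictionaries.
# The type assignments are assumed for the sake of the example.
ASSUMED_TYPES = {
    "_cell.length_a": "numb",
    "_entity.id": "char",
}


def to_json_value(dataname, raw, types):
    """Return a float only when the declared type is numeric; otherwise keep the string."""
    if types.get(dataname) == "numb" and LOOKS_NUMERIC.fullmatch(raw):
        return float(raw)
    # Unknown dataname (for example a local one) or a non-numeric type:
    # the value may look like a number, but the converter cannot know.
    return raw
```

With the assumed table, `to_json_value("_cell.length_a", "5.431", ASSUMED_TYPES)` gives a float, while `"1"` for `_entity.id` and `"42"` for the undeclared `_local_batch` both stay strings. With an empty table, every value stays a string, which is the situation of a parser that has no dictionary.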
The consumer of the JSON, on the other hand, will know which of the datanames it cares about are numeric and can perform the conversion itself (following CIF rules; I don't know whether the C++17 standard is relevant here). The optimisation that I see is that, having performed this conversion once, it may be nice to be able to pass the JSON on to another script written by a different author and preserve the conversion work that has been done.
On 11 May 2017 at 17:16, Marcin Wojdyr <wojdyr@gmail.com> wrote:
> As far as numbers go, it is clear that representation of numbers as strings
> should be allowed in order to support translation from CIF files.
Translation from CIF is *easier* when the numbers are written as
strings, because one doesn't even need to parse numbers. But,
compared with the complexity of parsing the CIF format, parsing numbs and
writing them as possibly two separate numbers is not that difficult.
The downside of the quoted representation in JSON is of course that
the recipient of such a JSON file, after presumably using a third-party
JSON parser, needs to finish the parsing themselves.
John reasonably argued that a single representation is better than
two. After thinking about it I'd agree. But I'd not agree with the
choice. Parsing numbers on the reading side should be done entirely by
a JSON parser. Usually there are more consumers of file formats than
producers, and the extra complexity is preferable in the CIF->JSON
step rather than when working with JSON.
If anyone thinks that parsing numbs is trivial and no extra complexity
is involved, I propose that someone familiar with C or C++ write here
a (thread-safe) function that can parse the numb format. As a hint:
functions to parse numbers in a locale-independent way are available
only in C++17, which is not widely adopted yet.
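For a sense of what "finishing the parsing" of a quoted numb involves on the consumer side, here is a rough Python sketch. It is not an answer to the C/C++ challenge above: Python's `float()` does not consult the locale, so the awkward part in C/C++ does not arise here. The pattern is my reading of the numb grammar (a number optionally followed by a standard uncertainty in parentheses) and should be checked against the CIF specification; `parse_numb` is a hypothetical name, and the uncertainty digits are assumed to apply to the last digits of the mantissa.

```python
import re
from decimal import Decimal

# Assumed shape of a CIF numb: sign, mantissa, optional exponent,
# optional standard uncertainty in parentheses.
NUMB = re.compile(
    r"""(?P<sign>[+-]?)
        (?P<mantissa>[0-9]+\.?[0-9]*|\.[0-9]+)
        (?:[eE](?P<exp>[+-]?[0-9]+))?
        (?:\((?P<su>[0-9]+)\))?""",
    re.VERBOSE,
)


def parse_numb(text):
    """Return (value, su) for a numb string, or None if it is not a numb."""
    m = NUMB.fullmatch(text)
    if m is None:
        return None
    mantissa = m.group("mantissa")
    exp = int(m.group("exp") or 0)
    value = float(f"{m.group('sign')}{mantissa}e{exp}")
    su = None
    if m.group("su") is not None:
        # Scale the uncertainty digits to the last decimal place of the mantissa.
        decimals = len(mantissa.partition(".")[2])
        su = float(Decimal(m.group("su")).scaleb(exp - decimals))
    return value, su
```

For example, `parse_numb("1.234(5)")` returns `(1.234, 0.005)`, and a string such as `"1,5"` returns `None`. This is the "possibly two separate numbers" case: one numb yields a value and, optionally, an uncertainty.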
Marcin
_______________________________________________
cif-developers mailing list
cif-developers@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/cif-developers
--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148