Re: Draft JSON specification for CIF
- Subject: Re: Draft JSON specification for CIF
- From: Robert Hanson <hansonr@xxxxxxxxxx>
- Date: Wed, 12 Apr 2017 22:29:08 -0500
- In-Reply-To: <CAM+dB2fNDVbGGJHZo0ivTe1QZc3zMCGUgh2PgFi7T5zsyCCq_A@mail.gmail.com>
- References: <CAM+dB2fszww=4A_w6evqg=5O9KKLnujajmg_SPSX=hCRQiBPtg@mail.gmail.com><CAF_YUvWYYDuDRvuDN1k=T0Ym8M3enEBPr45KNWF9m1Wng21rEw@mail.gmail.com><CAM+dB2fNDVbGGJHZo0ivTe1QZc3zMCGUgh2PgFi7T5zsyCCq_A@mail.gmail.com>
I forgot to say thank you for thinking hard about how to do this. Lots of good ideas there.
On Wed, Apr 12, 2017 at 7:28 PM, James Hester <jamesrhester@gmail.com> wrote:
In reply to Bob:
(1) CIF2 header: Sorry, my mistake, there should be a CIF2 header for the example CIF file (it is required). I have corrected the draft at GitHub.
(2) ["dataname.a"] vs "dataname.a". I don't understand why JSON would require that a string containing a period should be put inside square brackets.
Nothing requires it. When people use JSON they typically transform it into JavaScript objects or other-language associative arrays. In that case we would have to use xxx["dataname.a"] to reference this field, because xxx.dataname.a would reference an element of xxx.dataname, which does not exist. It just seems odd to me.
(3) Substituting "_" for ".". The proposed standard simply preserves the dataname as it appears in the CIF file (whether it contains "." or not). Strictly speaking, any equivalence between datanames can only be determined by consulting a CIF dictionary. Although COMCIFS naming policy leads to some useful shortcuts, they are not part of the CIF syntax standard so should not be included in the JSON standard.
I guess I was thinking about mmCIF and its structured category.item keys such as _atom_site.auth_atom_id and wondered if one might want to have this JSON creating an _atom_site object with an auth_atom_id key/value pair when there is a period in the name. But then, again, we have CIF keys such as _atom_site.aniso_B[2][3]. As I think about it now, that's probably not worth the trouble, since the numbers in mmCIF keys like that are 1-based, not 0-based, and that would be a headache. I guess it is what it is. And, besides, mmCIF is not CIF2, is it....
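For illustration only, the category.item idea considered above could be sketched as a post-processing step on an already-parsed data block. The function name nest_by_category and the sample data names and values are made up for this sketch and are not part of the draft. It assumes a plain dict of data names and that no data name equals the category prefix of another; keys such as aniso_B[2][3] are left untouched.

```python
def nest_by_category(block):
    """Group 'category.item' data names into nested dicts (hypothetical helper)."""
    nested = {}
    for name, value in block.items():
        # Split on the first period only; names without a period stay flat.
        category, sep, item = name.partition(".")
        if sep:
            nested.setdefault(category, {})[item] = value
        else:
            nested[name] = value
    return nested


flat = {
    "_atom_site.auth_atom_id": ["CA", "CB"],
    "_atom_site.aniso_B[2][3]": ["0.1", "0.2"],
    "_cell_length_a": "5.0",
}
nested = nest_by_category(flat)
```

With this, nested["_atom_site"]["auth_atom_id"] holds the column values, while the bracketed indices in aniso_B[2][3] remain part of the key.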
(4) Agree that a test implementation that can digest a range of valid CIF files would be necessary before acceptance. Once we have some consensus I'm happy to produce something in Python (although one using the CIFAPI would be great as well).
I guess it's no problem that a JSON reader might completely scramble the order of tags in a CIF file, right?
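As a small check of the ordering question, here is a sketch using Python's standard json module. JSON objects are unordered collections of name/value pairs, so two texts that differ only in key order parse to equal dicts. The data names and values are invented for the example.

```python
import json


def same_content(text_a, text_b):
    """Return True if two JSON texts hold the same data, ignoring key order."""
    return json.loads(text_a) == json.loads(text_b)


a = '{"_cell_length_a": "5.0", "_cell_length_b": "6.0"}'
b = '{"_cell_length_b": "6.0", "_cell_length_a": "5.0"}'
order_irrelevant = same_content(a, b)
```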
What about capitalization? While it is true that CIF data names are case insensitive -- and maybe for that reason -- it seems to me that normalizing that to all lower case would be a good move. It would make reading these immensely easier and faster. I think if I wrote a reader, I would have to go through all the keys and change them to lower case prior to doing anything with it. Otherwise I couldn't reliably check for a key.
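A minimal sketch of the normalization pass a reader would need if keys are not lower-cased on output. The helper name lowercase_keys is hypothetical. It uses str.lower() as a simplification and raises if two keys collide after lower-casing, which should not happen in a valid CIF data block.

```python
def lowercase_keys(block):
    """Return a copy of a data block with all data names lower-cased."""
    normalized = {}
    for name, value in block.items():
        key = name.lower()
        if key in normalized:
            # Two names differing only in case would be the same CIF data name.
            raise ValueError("duplicate data name after lower-casing: " + key)
        normalized[key] = value
    return normalized


block = lowercase_keys({"_Cell_Length_A": "5.0", "_cell_length_b": "6.0"})
has_a = "_cell_length_a" in block
```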
One of the trickier aspects of reading CIF files is that whenever you have a single object -- say, a single atom, or a single operator -- you can represent it as a loop or individually. This will lead to complications in JSON as well, because for many fields one will still have to check each time for the existence of an array type or a simple value. I can see how the "loop tags" idea might help there. Personally, I would prefer "loop_tags" not "loop tags" because, again, that's just a pain to have to always use ["xxxx"] to reference elements with keys like that.
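The array-or-simple-value check described above might look like the following sketch. The helper name as_list is hypothetical, and it assumes that an array always means a looped column; a CIF2 list-valued item that is also serialized as a JSON array would need extra information, such as the loop tags, to tell the two apart.

```python
def as_list(value):
    """Return looped values unchanged and wrap a single value in a list."""
    if isinstance(value, list):
        return value
    return [value]


single = as_list("x,y,z")
looped = as_list(["x,y,z", "-x,-y,-z"])
```

A reader can then iterate over as_list(block[key]) without caring whether the item was looped in the original file.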
Bob
_______________________________________________ cif-developers mailing list cif-developers@iucr.org http://mailman.iucr.org/cgi-bin/mailman/listinfo/cif-developers
- Follow-Ups:
- Re: Draft JSON specification for CIF (Marcin Wojdyr)
- References:
- Draft JSON specification for CIF (James Hester)
- Re: Draft JSON specification for CIF (Robert Hanson)
- Re: Draft JSON specification for CIF (James Hester)