Re: CIF-JSON draft 2017-05-15

On Tue, May 16, 2017 at 1:00 AM, James Hester <jamesrhester@gmail.com> wrote:

On 16 May 2017 at 14:30, Robert Hanson <hansonr@stolaf.edu> wrote:
Two last comments:

1. This comment in the CIF1 spec:

13. The base CIF specification distinguishes between character and numeric values (see paragraph 15 of the document Common semantic features). Particular CIF applications may make more finely-grained distinctions within these types. The paragraphs immediately above have the corollary that a data value such as 12 that appears within a CIF may be quoted (e.g. '12') if, and only if it is to be interpreted and stored in computer memory as a character string and not a numeric value. For example '12' might legitimately appear as a label for an atomic site, where another alphabetic or alphanumeric string such as 'C12' is also acceptable; but it may not legitimately be used to represent an integer quantity twelve.

suggests that CIF-JSON absolutely cannot be round-tripped back to CIF format.

It can be round-tripped if you have access to dictionary definitions for every data name appearing in every data block of the JSON object. That is not the case in general, but it probably is for, e.g., many of the curated CIFs provided by structural databases.

Ah, right! Because then you know whether a value is supposed to be a number or not. I get it.
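The point about dictionaries can be made concrete with a minimal sketch. It assumes that every value arrives from the JSON side as a string, and that a lookup table of data-name types has already been built from the relevant dictionaries. The table `DATA_NAME_TYPES`, its two entries, and the function `format_cif_value` are hypothetical names used only for illustration; they are not part of the CIF-JSON draft. Quoting of values that contain whitespace or other special characters is deliberately left out.

```python
import re

# Hypothetical type table. A real one would be derived from the CIF
# dictionaries that define each data name ("numb" vs "char").
DATA_NAME_TYPES = {
    "_cell_length_a": "numb",
    "_atom_site_label": "char",
}

# Matches text that a CIF parser would read as a number if left unquoted,
# including an optional standard uncertainty in parentheses, e.g. 5.431(2).
_LOOKS_NUMERIC = re.compile(r"^[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?(\(\d+\))?$")


def format_cif_value(data_name, value):
    """Return `value` as it should be written to CIF for `data_name`."""
    kind = DATA_NAME_TYPES.get(data_name)
    if kind is None:
        # Without a dictionary definition the intended type is unknown,
        # which is exactly the case where round-tripping breaks down.
        raise KeyError("no dictionary definition for " + data_name)
    if kind == "numb":
        return value  # numeric values must stay unquoted
    if _LOOKS_NUMERIC.match(value):
        return "'" + value + "'"  # character string that looks like a number
    return value


label = format_cif_value("_atom_site_label", "12")   # quoted, it is a label
length = format_cif_value("_cell_length_a", "12")    # unquoted, it is a number
```

A data name missing from the table raises an error rather than guessing, since a wrong guess would silently change the meaning of the value.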

2. One thing we are missing here is a magic number. I know, transport should not need that. But before you know it, someone is going to put a CIF-JSON data stream into a file and then pass that file to another program such as Jmol, expecting the program to know what it is reading. This has always been a nightmare for Jmol. The #\#CIF_2.0 header takes care of that for CIF. In the past all too often people have created formats that don't quickly identify themselves, causing all sorts of headaches. What would you think of this?


This way a reader could always know immediately that it is reading CIF-JSON data, with just a 10-byte stream read. It would also allow the data to be saved to a file and retrieved later.

Interesting idea.  We could change the 'Metadata' top-level JSON name to 'CIF-JSON' and recommend that this name appears first in the serialised stream, where possible. I don't think it is possible to make this stronger and mandate an order for keys in JSON objects, however.

It's always possible to do that. If it seems valuable, I suggest requiring it. Like I said, someone is going to just save this thing, and then I am going to get a request to create a reader for it in Jmol. Or even without that, the Jmol load command could be directed to a server of CIF-JSON and have to determine on the fly what it is reading so it can assign a reader automatically.

And then I have to create a resolver step to identify it among several dozen other formats. :( I guess Jmol could read the entire JSON string in prior to making a decision, but that is not the philosophy of Jmol. Right now there are only a couple of obscure formats that require coercing Jmol to use an explicit reader; all others are figured out automatically from the header area of the data stream.

Without a clear header, it would be a nightmare to tease out whether this thing is a CIF-JSON stream, particularly because the Metadata item could be anywhere in the file. I am so tired of formats that presume: "You wouldn't be reading this if you didn't know exactly what it was." Such designers think only of their own product and do not consider the broader context of its use.
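As a rough illustration of the header-based identification argued for above, here is a minimal sketch of a writer that puts an identifying key first and a resolver that inspects only the first 10 bytes of a stream. The byte prefix `{"CIF-JSON`, the function names, and the `version` metadata field are assumptions made for this example; the thread does not fix any of them. The approach only works if the serialiser emits no whitespace before the first key, which plain JSON does not guarantee.

```python
import io
import json

CIF2_MAGIC = b"#\\#CIF_2.0"        # magic code at the start of a CIF 2.0 file
CIF_JSON_PREFIX = b'{"CIF-JSON'    # assumed 10-byte prefix, not from any spec


def write_cif_json(blocks, metadata):
    """Serialise with the identifying key first and no leading whitespace."""
    doc = {"CIF-JSON": metadata}   # Python dicts keep insertion order
    doc.update(blocks)
    return json.dumps(doc, separators=(",", ":")).encode("utf-8")


def sniff_format(stream, nbytes=10):
    """Guess the format from the first `nbytes` bytes of a binary stream."""
    head = stream.read(nbytes)
    if head.startswith(CIF2_MAGIC):
        return "CIF2"
    if head.startswith(CIF_JSON_PREFIX):
        return "CIF-JSON"
    return None  # unknown: leave it to the other resolvers


# Hypothetical data block and metadata, for illustration only.
payload = write_cif_json({"example": {"_cell_length_a": "5.4"}},
                         {"version": "draft"})
detected = sniff_format(io.BytesIO(payload))
```

The resolver never parses the JSON, so it can run before a reader is chosen. The cost is that the writer must be constrained more tightly than JSON itself requires.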

cif-developers mailing list

International Union of Crystallography

Scientific Union Member of the International Science Council (admitted 1947). Member of CODATA, the ISC Committee on Data. Partner with UNESCO, the United Nations Educational, Scientific and Cultural Organization in the International Year of Crystallography 2014.

International Science Council Scientific Freedom Policy

The IUCr observes the basic policy of non-discrimination and affirms the right and freedom of scientists to associate in international scientific activity without regard to such factors as ethnic origin, religion, citizenship, language, political stance, gender, sex or age, in accordance with the Statutes of the International Council for Science.