Re: mmJSON (CIF-JSON-like format from PDBj)

  Subject: Re: mmJSON (CIF-JSON-like format from PDBj)
  From: James Hester <jamesrhester@xxxxxxxxx>
  Date: Thu, 29 Jun 2017 13:05:01 +1000
It is interesting to see the different choices made here. As far as I can tell (I couldn't find any formal spec), the values for each data name are put into an array, which is then attached to that data name in an associative array (much like CIF-JSON). The key difference is that these per-loop associative arrays are then attached to category names in a higher-level associative array.

The GitHub site for this project (https://github.com/gjbekker/cif-parsers) has an example:
  {
    "data_PDBID": {
      "pdbx_category1": {
        "field1": [1, 2, 3, 4],
        "field2": ["one", "two", "three", "four"],
        "field3": [1.0, 2.0, 3.0, 4.0]
      },
      "pdbx_category2": {
        "field1": [1, 2],
        "field2": ["one", "two"],
        "field3": [1.0, 2.0]
      },
      "pdbx_category3": {
        "field1": [1],
        "field2": ["one"],
        "field3": [1.0]
      }
    }
  }
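
To make the layout concrete, here is a minimal Python sketch of how a consumer might turn one category's column arrays back into rows. It assumes the example is completed with its closing braces so that it parses as JSON; the helper name `category_rows` is my own and is not part of the cif-parsers project.

```python
import json

# The example layout, abbreviated and completed into valid JSON
# (the closing braces are an assumption about the original example).
doc = json.loads("""
{
  "data_PDBID": {
    "pdbx_category1": {
      "field1": [1, 2, 3, 4],
      "field2": ["one", "two", "three", "four"],
      "field3": [1.0, 2.0, 3.0, 4.0]
    },
    "pdbx_category3": {
      "field1": [1],
      "field2": ["one"],
      "field3": [1.0]
    }
  }
}
""")


def category_rows(block, category):
    """Turn one category's column arrays into a list of row dicts.

    Hypothetical helper: each category maps field names to equal-length
    arrays, so zipping the arrays yields one tuple per loop row.
    """
    columns = block[category]
    names = list(columns)
    return [dict(zip(names, values)) for values in zip(*columns.values())]


rows = category_rows(doc["data_PDBID"], "pdbx_category1")
print(rows[1])  # {'field1': 2, 'field2': 'two', 'field3': 2.0}
```

Because every category is a mapping from field name to array, a single-row category such as pdbx_category3 is handled by the same code path as a multi-row loop.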

What is also interesting is that the parser that produces this (found at the above website) appears to optionally use the pdbx/mmCIF dictionary (in JSON form) and apply all the type information found there to produce the correctly-typed arrays that appear in the above example. As far as I know the wwPDB enforces a strict use of quotes to surround any non-numerical value so such double-checking is not strictly required.
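
As a rough illustration of what dictionary-driven typing could look like, here is a Python sketch. It is not the cif-parsers implementation: the type table `ITEM_TYPES` is invented for the example (a real one would be derived from the pdbx/mmCIF dictionary), and mapping the CIF placeholders `?` and `.` to `None` is my assumption.

```python
# Hypothetical type table; in practice this would be derived from the
# pdbx/mmCIF dictionary. The entries here are made up for illustration.
ITEM_TYPES = {"field1": int, "field2": str, "field3": float}


def cast_column(name, raw_values, item_types=ITEM_TYPES):
    """Convert raw CIF strings in one column to typed values.

    Assumption: the CIF placeholders '?' and '.' are mapped to None.
    Items missing from the type table are left as strings.
    """
    convert = item_types.get(name, str)
    return [None if v in ("?", ".") else convert(v) for v in raw_values]


raw_category = {
    "field1": ["1", "2", "?"],
    "field2": ["one", "two", "three"],
    "field3": ["1.0", ".", "3.0"],
}

typed = {name: cast_column(name, values) for name, values in raw_category.items()}
print(typed["field1"])  # [1, 2, None]
print(typed["field3"])  # [1.0, None, 3.0]
```

If quoting in the input is reliable, the same result could be had without the dictionary by treating every unquoted value as numeric, which is the point made above about double-checking.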

On 28 June 2017 at 20:46, Marcin Wojdyr <wojdyr@gmail.com> wrote:
Hi All,
I just came across mmJSON format:
and just wanted to share it. It's similar to CIF-JSON but tailored for
mmCIF files.

The paper about it says[1]:
An analysis showed that the compressed mmJSON is on average
approximately 33 or 56 % smaller than a compressed mmCIF or PDBML
formatted file, respectively, making it more suitable for web

[1] https://jcheminf.springeropen.com/articles/10.1186/s13321-016-0155-1
