RE: CIF-JSON new draft

On Wednesday, May 03, 2017 5:42 PM, Robert Hanson wrote:
> On Wed, May 3, 2017 at 4:03 PM, Bollinger, John C <John.Bollinger@stjude.org> wrote:
>> Dear CIF Developers,
>>
>> I additionally think that the current version of the specification is too lax about use of the "loop tags" item.  If the CIF version is unspecified among the metadata or if it is specified with value 2.0, and if the data block being represented contains either at least one loop or at least one unlooped item whose value is a CIF2 list, then the meaning of some values is ambiguous without the "loop tags".  It is undesirable to allow such ambiguity.
>
>
> John, I don't see that. Can you give a concrete example?

Sure.  Consider this JSON:

{
  "block": {
    "_xyz": [ 0.1, 0.2, 0.3 ]
  }
}

It appears to me that it corresponds both to

#\#CIF_2.0
data_block
_xyz [ 0.1 0.2 0.3 ]
# end of CIF

and to

#\#CIF_2.0
data_block
loop_
_xyz
 0.1
 0.2
 0.3
# end of CIF

Of course, the latter could trivially be converted to CIF 1.1, with that result also corresponding to the given JSON.

On the other hand, only the latter corresponds to this JSON:

{
  "block": {
    "_xyz": [ 0.1, 0.2, 0.3 ],
    "loop tags": [["_xyz"]]
  }
}

and only the former corresponds to this JSON:

{
  "block": {
    "_xyz": [ 0.1, 0.2, 0.3 ],
    "loop tags": []
  }
}

The problem arises from the similarity between items 5.v and 5.vii in the draft spec: CIF 2.0 list values are presented as JSON lists / arrays, and the multiple values of a looped item are also presented as JSON lists / arrays.  As I read the draft, however, the values of unlooped items are presented bare, as opposed to, for example, as single-element arrays.  Therefore, some form of metadata is required to distinguish whether an array presented as an item's value represents a single CIF list value or multiple distinct values taken by the item in a loop.  The appropriate "loop tags" field can provide the needed metadata if it is present, given the requirement that if it appears at all then it must describe all loops in the data block.  I had supposed that the ability to use it for that purpose was the reason for that constraint.
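
To make the ambiguity concrete, here is a minimal Python sketch of how a reader of CIF-JSON might classify each item in a data block.  The function name `classify_items` and the labels it returns are hypothetical and do not come from the draft; the sketch also assumes that the CIF version is either unspecified or 2.0, so that a JSON array may legitimately stand for a CIF2 list.

```python
def classify_items(block):
    """Label each item in a CIF-JSON data block (hypothetical helper).

    Returns a dict mapping item name to one of:
      "scalar"      - a bare (non-array) value of an unlooped item
      "loop values" - an array holding the item's values from a loop
      "list value"  - an array holding a single CIF2 list value
      "ambiguous"   - an array that cannot be classified without "loop tags"
    """
    loop_tags = block.get("loop tags")  # None when the field is absent
    if loop_tags is None:
        looped = None
    else:
        # Flatten [["_a", "_b"], ["_c"]] into {"_a", "_b", "_c"}.
        looped = {tag for loop in loop_tags for tag in loop}

    labels = {}
    for name, value in block.items():
        if name == "loop tags":
            continue  # metadata, not a data item
        if not isinstance(value, list):
            labels[name] = "scalar"
        elif looped is None:
            labels[name] = "ambiguous"
        elif name in looped:
            labels[name] = "loop values"
        else:
            labels[name] = "list value"
    return labels


no_tags = {"_xyz": [0.1, 0.2, 0.3]}
with_loop = {"_xyz": [0.1, 0.2, 0.3], "loop tags": [["_xyz"]]}
empty_tags = {"_xyz": [0.1, 0.2, 0.3], "loop tags": []}

for block in (no_tags, with_loop, empty_tags):
    print(classify_items(block))
```

Only the first of the three blocks leaves `_xyz` unclassified, which is exactly the case in which "loop tags" has been omitted.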

However, CIF makes no inherent semantic distinction between an item presented as a scalar and the same item presented in a single-packet loop (and mmCIF in particular denies any significance to such differences).  An alternative to requiring "loop tags" would therefore be to present every item as if it were looped: regardless of whether an item appears syntactically in a loop in the CIF format, its one or many values would be presented in an array in CIF-JSON.  In that case, the "loop tags" field would not need to carry the burden of disambiguation, and the constraints on it could even be relaxed a bit.  Such a representation is consistent with CIF's underlying data model as I conceive it (http://forums.iucr.org/viewtopic.php?f=27&t=77).
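
As a rough illustration of that alternative, the following Python sketch converts a block in the draft's current style into one in which every item's value is an array of its values.  The function name `to_uniform` is hypothetical, and the sketch assumes that the input block carries a "loop tags" field, since without it the input itself would be ambiguous.

```python
def to_uniform(block):
    """Present every item as an array of its values (hypothetical helper).

    Assumes block["loop tags"] is present and describes all loops in the
    data block.  Looped items already carry an array of values and are
    kept as-is; each unlooped value is wrapped in a one-element array.
    """
    looped = {tag for loop in block["loop tags"] for tag in loop}
    uniform = {}
    for name, value in block.items():
        if name == "loop tags":
            continue  # not needed to interpret values in the uniform form
        uniform[name] = value if name in looped else [value]
    return uniform


# _xyz takes three values in a loop: the array is kept unchanged.
print(to_uniform({"_xyz": [0.1, 0.2, 0.3], "loop tags": [["_xyz"]]}))

# _xyz has one value, a CIF2 list: the list is wrapped in an outer array.
print(to_uniform({"_xyz": [0.1, 0.2, 0.3], "loop tags": []}))
```

In this form the two CIF inputs from the example above yield different JSON, `[0.1, 0.2, 0.3]` versus `[[0.1, 0.2, 0.3]]`, so the values can be interpreted without consulting "loop tags".  Note that the sketch discards the information about which items share a loop.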


John


_______________________________________________
cif-developers mailing list
cif-developers@iucr.org
http://mailman.iucr.org/cgi-bin/mailman/listinfo/cif-developers
