Discussion List Archives


Data names with primitive type info?

  • To: "Discussion list of the IUCr Committee for the Maintenance of the CIF Standard (COMCIFS)" <comcifs@iucr.org>
  • Subject: Data names with primitive type info?
  • From: Joe Krahn <krahn@niehs.nih.gov>
  • Date: Wed, 14 Mar 2007 10:49:39 -0400
I noticed that CIF has a rule that a numeric value must not be quoted
if it is intended to be stored in memory as a number. This is an
anomaly, considering that CIF is a text-data format. Given that CIF is
intended for scientific data, I think it would be useful to add a more
concrete concept of numeric types. Type information is currently limited
to the dictionary, but I think it is advantageous to keep the data
fairly well self-described, as originally envisioned with STAR, and to
optimize the format specifically for scientific data. Otherwise, it
would be more efficient to switch to some form of XML.
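
To make the quoting rule above concrete, here is a minimal Python sketch
of how a reader might tell numbers from strings when the only clue is
the presence of quotes. The function name and the simplified numeric
pattern are my own assumptions for illustration; this is not the
official CIF grammar.

```python
import re

# Simplified numeric pattern (an assumption, not the official CIF grammar):
# optional sign, digits with optional fraction, optional exponent,
# optional esd in parentheses, e.g. 1.234(5).
_NUMERIC = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?(\(\d+\))?")


def classify_value(token: str) -> str:
    """Return 'number' or 'string' for one raw value token.

    Hypothetical helper: a quoted token is always a string, even if its
    contents look numeric; an unquoted token is a number only if it
    matches the simplified numeric pattern.
    """
    quoted = len(token) >= 2 and token[0] in "'\"" and token[-1] == token[0]
    if quoted:
        return "string"
    if _NUMERIC.fullmatch(token):
        return "number"
    return "string"
```

Under these assumptions, the unquoted token 1.234 is classified as a
number, while the quoted token '1.234' is classified as a string.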

The addition I think would be useful is a primitive data-type flag
included in the data name, to distinguish string, float and integer
types, and to mark the presence of a numeric esd (estimated standard
deviation). The flag could be introduced by a leading dollar sign, which
is the only name exclusion, and that exclusion exists only for STAR
compatibility. For example, _$I$dataname could be an integer flag.
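
As a rough sketch of how a parser might use such a flag, the Python
below splits a flagged name into a type and a plain data name. Only the
_$I$ form comes from the proposal above; the letters F and S, the
function names, and the fallback behavior are assumptions made for
illustration, and an esd flag is left out.

```python
import re
from typing import Any, Optional, Tuple

# Hypothetical flag letters. Only "I" (integer) appears in the proposal;
# "F" (float) and "S" (string) are assumptions made for this sketch.
TYPE_FLAGS = {"I": int, "F": float, "S": str}

# Matches names of the form _$X$rest, where X is a single uppercase letter.
_TYPED_NAME = re.compile(r"_\$([A-Z])\$(.+)")


def split_typed_name(name: str) -> Tuple[Optional[type], str]:
    """Split a flagged data name into (type, plain name).

    Names without a recognized flag are returned unchanged with type None.
    """
    match = _TYPED_NAME.fullmatch(name)
    if match is None or match.group(1) not in TYPE_FLAGS:
        return None, name
    return TYPE_FLAGS[match.group(1)], "_" + match.group(2)


def convert(name: str, raw: str) -> Tuple[str, Any]:
    """Convert a raw text value using the type flag in its data name."""
    kind, plain = split_typed_name(name)
    return plain, (kind(raw) if kind is not None else raw)
```

With these assumptions, convert("_$I$dataname", "42") yields the plain
name _dataname with the integer 42, and no dictionary lookup is needed
to decide how to store the value.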

The current plan leans toward requiring a dictionary (schema) to
interpret the data. I think it is a big benefit to keep the data
self-described, at least for parsing it into memory, but I am new to
thinking about CIF seriously. What do other people think?

Email discussion is rather slow, so I'm hoping for a more effective
discussion at the ACA meeting. Meanwhile, I wanted to get some of my
ideas out for consideration.

