Discussion List Archives


Re: Important CIF items for discussion

On Fri, Jul 18, 2008 at 3:12 AM, Joe Krahn <krahn@niehs.nih.gov> wrote:

(regarding ways of embedding dictionaries in data files)

> Another alternative is to just use MIME encapsulation, already defined
> for Binary CIF. As it is implemented there, the data block is all
> base-64 encoded, which contains no semicolons. But, MIME can also
> encapsulate unencoded text, by defining the end-of-data marker. This
> requires the low-level parser to understand the MIME format.

That last point is the killer: it would require the parser to
understand a whole other standard, change the CIF standard
significantly at the syntactic level, and break all current parsers
(see comments on save frames at the end).
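
For context, the sketch below illustrates why the encapsulation question comes up at all. It assumes the usual CIF rule that a text field opens with a semicolon in the first column and ends at the next line that begins with a semicolon. The function name `read_text_field` and the sample names `data_my_dictionary` and `_item.description` are made up for illustration; this is not taken from any existing parser.

```python
import base64

def read_text_field(lines, start):
    """Read a semicolon-delimited text field that opens at lines[start].

    Returns (content, index_of_closing_line). Simplified: any later line
    beginning with ';' is treated as the terminator.
    """
    if not lines[start].startswith(";"):
        raise ValueError("text field must open with ';' in column one")
    body = [lines[start][1:]]
    for i in range(start + 1, len(lines)):
        if lines[i].startswith(";"):
            return "\n".join(body), i
        body.append(lines[i])
    raise ValueError("unterminated text field")

# A dictionary fragment embedded as the value of an outer text field.
embedded = [
    ";",                        # opens the outer field
    "data_my_dictionary",       # hypothetical embedded dictionary block
    "_item.description",
    ";",                        # meant to open an inner field, but closes the outer one
    "Inner description text.",
    ";",
    ";",
]

content, end = read_text_field(embedded, 0)
# The outer field stops at index 3, so the inner text is lost.

# Base-64 output uses only A-Z, a-z, 0-9, '+', '/' and '=', so an encoded
# payload can never contain a semicolon.
payload = "\n".join(embedded[1:-1]).encode("ascii")
encoded = base64.b64encode(payload).decode("ascii")
```

The outer field is cut short at the first inner semicolon line, which is the conflict that base-64 encoding, an escape sequence, or save frames would each avoid in a different way.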

> I should also mention that the newer STAR format has a bracketed list
> format that includes backslash escape sequences to allow contained items
> to have a bracket. If this is adopted by CIF, that may be enough of a
> precedent to support backslash-escapes for beginning-of-line semicolons
> as well.

I hadn't heard about the <backslash><bracket> escape sequence for STAR.
Any idea where information about it might be found? Enclosing any
element of the bracketed list in inverted commas or apostrophes
should be sufficient to prevent the STAR parser from interpreting
the bracket, so the backslash notation would appear to be
unnecessary. That said, I don't think that COMCIFS would be too
critical of a new escape sequence if there was a perceived benefit.
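
To make the quoting argument concrete, here is a minimal sketch of a list tokenizer. It assumes a simplified, whitespace-separated bracketed list with no nesting, which may not match the actual STAR grammar, and it ignores the rule that a closing quote must be followed by whitespace. The function name `parse_list` is hypothetical.

```python
def parse_list(text):
    """Parse a simplified bracketed list such as "[a 'b]c' d]" into a list of strings."""
    text = text.strip()
    if not text.startswith("["):
        raise ValueError("expected '['")
    items, i, n = [], 1, len(text)
    while i < n:
        ch = text[i]
        if ch.isspace():
            i += 1
        elif ch == "]":
            # An unquoted closing bracket ends the list.
            return items
        elif ch == "[":
            raise ValueError("nested lists are not handled in this sketch")
        elif ch in "'\"":
            # Quoted item: everything up to the matching quote is literal,
            # including any brackets.
            end = text.find(ch, i + 1)
            if end == -1:
                raise ValueError("unterminated quoted string")
            items.append(text[i + 1:end])
            i = end + 1
        else:
            # Bare item: runs until whitespace or a bracket.
            j = i
            while j < n and not text[j].isspace() and text[j] not in "[]":
                j += 1
            items.append(text[i:j])
            i = j
    raise ValueError("unterminated list")

quoted = parse_list("[a 'b]c' d]")    # bracket inside quotes is kept as data
unquoted = parse_list("[a b]c d]")    # bare bracket closes the list early
```

With quoting, the bracket survives as part of the item; without it, the list ends at the first closing bracket. Under these assumptions a backslash escape adds nothing that quoting does not already provide.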

> Here is a snippet of my idea for storing dictionary data within a
> datablock, using a SAVE frame, so that it is part of the actual CIF
> syntax rather than using a text block, which means the data has to be
> double-parsed.
> By using save frames, it is easy to avoid conflicts, because CIF
> currently restricts their use to dictionaries. I would also support
> nested SAVE frames so that the whole dictionary syntax can be fit inside
> of one parent SAVE frame, rather than having to split it.

I agree that save frames are an excellent solution to the
problem...except that most current CIF parsers are not able to parse
save frames in data files, so by allowing data files to contain save
frames you will immediately render most CIF-reading programs broken
(at least in spirit).  User-defined dictionaries are rare, and it is
rarer still that such a dictionary will be useful to a given program,
so the cost of breaking all those programs is not worth

best wishes,
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
