Re: [ddlm-group] CIF-2 changes

Dear Colleagues,

   I am coping with the rapid ebb and flow of what delimiters will be
used for CIF2 by providing controls for both CIFtbx_4 and CBFlib_9
to turn on or off sensitivity to (), [] and {} delimiters separately,
so I can deal with whatever the final decision is by changing the
defaults, ...
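
   A minimal sketch of what such separate controls could look like.
The names DelimiterControls and split_at_delimiter, and the default
settings, are hypothetical and chosen only for illustration; they are
not the CIFtbx or CBFlib interfaces.  The sketch looks at bracket
characters only and ignores every other lexical rule.

```python
from dataclasses import dataclass


@dataclass
class DelimiterControls:
    """Hypothetical per-pair switches; not the CIFtbx or CBFlib API."""
    braces: bool = True      # treat { and } as delimiters
    brackets: bool = False   # treat [ and ] as delimiters
    parens: bool = False     # treat ( and ) as delimiters

    def reserved(self) -> frozenset:
        """Return the bracket characters currently treated as delimiters."""
        chars = ""
        if self.braces:
            chars += "{}"
        if self.brackets:
            chars += "[]"
        if self.parens:
            chars += "()"
        return frozenset(chars)


def split_at_delimiter(token: str, controls: DelimiterControls):
    """Split a whitespace-free token at its first reserved bracket.

    Returns (head, rest); rest is "" when no reserved character occurs.
    """
    reserved = controls.reserved()
    for i, ch in enumerate(token):
        if ch in reserved:
            return token[:i], token[i:]
    return token, ""
```

With the assumed defaults, a name such as _atom_site.aniso_U[1][1]
comes back whole; with brackets=True it is cut in front of the first [,
which is the behaviour that changing a default would switch on or off.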

   BUT we really do have to settle this.  Ignoring the comma, quote and
colon issues, since 2007 we have gone through use of all of (), [], and
{} as reserved delimiters, use of just () and {}, and use of just {}.  It
is time to simply make a firm and final decision.

   We have dictionaries to write and code to implement.  I have already
delayed delivery of software to deal with this, and I really don't like
delaying delivery of software.

   The use of [] in CIF1 dictionary category names is well-established.
I prefer to minimize the impact on handling existing dictionaries in a
CIF2 context and to stick strictly to {} as the only brackets in the
set of reserved delimiters and to allow both () and [] in non-delimited
strings and data names.  So let us have another straw vote:

   Option 1:  {} will be reserved, but [] and () will not be reserved; or
   Option 2:  {} and [] will be reserved, but () will not be reserved; or
   Option 3:  {} and () will be reserved, but [] will not be reserved; or
   Option 4:  {}, () and [] will all be reserved
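
   To make the practical difference between the options concrete, here
is a rough sketch that reports which options would still accept a given
non-delimited string.  The table OPTIONS and the function names are
illustrative assumptions, and the check considers bracket characters
only, not the rest of the CIF2 lexical rules.

```python
# Bracket characters reserved under each option of the straw vote.
OPTIONS = {
    1: "{}",
    2: "{}[]",
    3: "{}()",
    4: "{}()[]",
}


def allowed_unquoted(text: str, option: int) -> bool:
    """True if text contains none of the brackets reserved by option."""
    return not any(ch in OPTIONS[option] for ch in text)


def options_accepting(text: str) -> list:
    """List the option numbers under which text needs no quoting."""
    return [opt for opt in sorted(OPTIONS) if allowed_unquoted(text, opt)]


if __name__ == "__main__":
    # Example strings: a data name with square brackets, and a numeric
    # value carrying a standard uncertainty in parentheses.
    for sample in ("_atom_site.aniso_U[1][1]", "5.432(2)", "_cell.length_a"):
        print(sample, options_accepting(sample))
```

Under these assumptions a name containing [] survives only Options 1
and 3, a value such as 5.432(2) survives only Options 1 and 2, and
Option 1 is the only choice that leaves both untouched.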

I suggest preference voting, noting whether any option is fatal to anyone.

My preferences are, in decreasing order:  1, then 3 and 2, then 4; but, as
long as we settle this once and for all, I can live with any of them.


  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769


On Mon, 9 Nov 2009, James Hester wrote:

> Re datanames: remember that we have made a more or less explicit
> promise that current datanames can be used without change in CIF2
> files, therefore datanames with square brackets will be legitimate in
> CIF2 data files.  I don't recall any discussion where we agreed to
> work around this by some sort of reprocessing.
> And the only CIF2 parsers that will fail when they see a square
> bracket in a dataname are those that are (incorrectly) prepared to
> accept no spaces between dataname and datavalue.  So I repeat: the
> only reason we have moved away from square brackets as list delimiters
> is so that in the specific case that a space is missing between a
> dataname and a datavalue the parser can continue.  I see no other
> justification.
> On Mon, Nov 9, 2009 at 12:55 PM, Nick Spadaccini <nick@csse.uwa.edu.au> wrote:
>> James and Joe are correct on this point. The dropping of [] was for reasons
>> of ease to older CIF1 files. BUT absolutely it introduces problems also,
>> while trying to ease other parts of the parsing process. I don't know if my
>> thinking was mature enough on this issue when I suggested the change.
>> Let me make my position clear. I WOULD MUCH PREFER to have lists defined by
>> square brackets and associative arrays by curly brackets. In this way the
>> parser can determine at the purely lexical level that it is in a list or an
>> associative array on reading the first [ or { when it is in the context.
>> My thinking for making both delimited by { came from the fact that there are
>> existing datanames with embedded [ and a CIF2 parser will take this to be
>> the beginning of a list. To simplify this parsing I suggested removing []
>> from the set of disallowed characters. Joe K quite correctly states that in
>> a CIF2 file there can be no [] in a dataname so it will be safe.
>> After this thread there was discussion on a leading comment identifying a
>> file as CIF2. IF THIS IS present the dilemma is removed. At the first line
>> of the parse we know whether to drop in to the CIF1 or the CIF2 lexical
>> rules of our parser. BUT I am NOT sure if we MANDATED this first line
>> comment. An alternative is to (essentially) require a pre-parse to determine
>> whether the file is CIF1 or CIF2. Such a pre-parser cannot assume either
>> rule set, but go through the first X lines, character by character until it
>> can confidently conclude it is one or the other.
>> Either way these approaches remove the problem of CIF1 from the syntactic
>> specification of CIF2 (again something I would prefer to do).
>> We should vote on this since it will make the issue concrete. We can employ
>> square brackets to identify lists if we abstract away the issue of existing
>> CIF1 datanames to a higher level. Which is a moot point anyway because there
>> are other aspects of CIF1 that break CIF2 parsers that we need to deal with.
>> Finally employing [] makes it much easier to cast everything into Python
>> (though this is just a convenience and not a critical reason for employing
>> them).
>> And yes, tuples have been dropped from the CIF2 data types. Immutability of
>> a tuple is an implementation issue and not a representation issue. In terms
>> of representation it makes no difference to call a CIF object a tuple or a
>> list.
> -- 
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group