
Re: [ddlm-group] CIF-2 changes

Re datanames: remember that we have made a more or less explicit
promise that current datanames can be used without change in CIF2
files; datanames with square brackets will therefore be legitimate in
CIF2 data files.  I don't recall any discussion where we agreed to
work around this by some sort of reprocessing.

And the only CIF2 parsers that will fail when they see a square
bracket in a dataname are those that are (incorrectly) prepared to
accept no spaces between dataname and datavalue.  So I repeat: the
only reason we have moved away from square brackets as list delimiters
is so that in the specific case that a space is missing between a
dataname and a datavalue the parser can continue.  I see no other
justification.
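
To make the point concrete, here is a minimal Python sketch, not a real
CIF2 parser.  It assumes whitespace is mandatory between a dataname and
its datavalue; the helper split_item, the dataname _demo_list and the
values are hypothetical and only for illustration.

```python
import re

# Assumption: a dataname is "_" followed by any run of non-whitespace
# characters, so square brackets inside it are unremarkable.
DATANAME = re.compile(r"_\S+")

def split_item(line):
    """Split '<dataname> <datavalue>' at the first run of whitespace."""
    parts = line.split(None, 1)
    if len(parts) != 2 or not DATANAME.fullmatch(parts[0]):
        raise ValueError(f"expected '<dataname> <datavalue>': {line!r}")
    return parts[0], parts[1].strip()

# A bracketed dataname followed by whitespace is read as one token.
print(split_item("_atom_site.aniso_U[1][1] 0.0123"))

# A bracket-delimited list after whitespace is equally unambiguous.
print(split_item("_demo_list [1 2 3]"))

# Only when the space is missing is the input misread: the opening
# bracket is swallowed into the dataname.
print(split_item("_demo_list[1 2 3]"))
```

The first two calls split cleanly; only the third, with the space
missing, folds the bracket into the dataname.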

On Mon, Nov 9, 2009 at 12:55 PM, Nick Spadaccini <nick@csse.uwa.edu.au> wrote:
> James and Joe are correct on this point. The dropping of [] was meant to
> ease the handling of older CIF1 files. BUT it absolutely introduces problems
> of its own, while trying to ease other parts of the parsing process. I don't know if my
> thinking was mature enough on this issue when I suggested the change.
>
> Let me make my position clear. I WOULD MUCH PREFER to have lists defined by
> square brackets and associative arrays by curly brackets. In this way the
> parser can determine at the purely lexical level that it is in a list or an
> associative array on reading the first [ or { when it is in the context.
>
> My thinking for making both delimited by { came from the fact that there are
> existing datanames with embedded [ and a CIF2 parser will take this to be
> the beginning of a list. To simplify this parsing I suggested removing []
> from the set of disallowed characters. Joe K quite correctly states that in
> a CIF2 file there can be no [] in a dataname so it will be safe.
>
> After this thread there was discussion on a leading comment identifying a
> file as CIF2. IF THIS IS present the dilemma is removed. At the first line
> of the parse we know whether to drop into the CIF1 or the CIF2 lexical
> rules of our parser. BUT I am NOT sure if we MANDATED this first line
> comment. An alternative is to (essentially) require a pre-parse to determine
> whether the file is CIF1 or CIF2. Such a pre-parser cannot assume either
> rule set, but must go through the first X lines, character by character,
> until it can confidently conclude it is one or the other.
>
> Either way these approaches remove the problem of CIF1 from the syntactic
> specification of CIF2 (again something I would prefer to do).
>
> We should vote on this since it will make the issue concrete. We can employ
> square brackets to identify lists if we abstract away the issue of existing
> CIF1 datanames to a higher level. Which is a moot point anyway because there
> are other aspects of CIF1 that break CIF2 parsers that we need to deal with.
>
> Finally employing [] makes it much easier to cast everything into Python
> (though this is just a convenience and not a critical reason for employing
> them).
>
> And yes, tuples have been dropped from the CIF2 data types. Immutability of
> a tuple is an implementation issue and not a representation issue. In terms
> of representation it makes no difference to call a CIF object a tuple or a
> list.
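
The leading-comment approach quoted above could be sketched as follows.
This is only an illustration: the exact comment text is not given in
this thread, so the string "#\#CIF_2.0" is an assumption, and
detect_version is a hypothetical helper name.

```python
# Assumed form of the identifying first-line comment.
CIF2_MAGIC = "#\\#CIF_2.0"

def detect_version(text):
    """Choose a lexical rule set from the first line alone."""
    lines = text.splitlines()
    first_line = lines[0] if lines else ""
    # Tolerate a leading byte-order mark before the comment.
    first_line = first_line.lstrip("\ufeff")
    if first_line.startswith(CIF2_MAGIC):
        return "CIF2"
    # Without the comment, fall back to the CIF1 rules.
    return "CIF1"

print(detect_version("#\\#CIF_2.0\ndata_demo\n_demo_list [1 2 3]\n"))
print(detect_version("data_demo\n_atom_site.aniso_U[1][1] 0.0123\n"))
```

If the comment is not mandated, the fallback branch is where a
character-by-character pre-parse would have to go instead.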


-- 
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
