Re: [ddlm-group] CIF-2 changes

To: Group finalising DDLm and associated dictionaries <[email protected]>
Subject: Re: [ddlm-group] CIF-2 changes
From: Nick Spadaccini <[email protected]>
Date: Mon, 09 Nov 2009 09:55:04 +0800
In-Reply-To: <[email protected]>
James and Joe are correct on this point. The [] delimiters were dropped to
make life easier for older CIF1 files. BUT that change certainly introduces
problems of its own, even as it eases other parts of the parsing process. I
don't know if my thinking was mature enough on this issue when I suggested
the change.

Let me make my position clear. I WOULD MUCH PREFER to have lists defined by
square brackets and associative arrays by curly brackets. In this way the
parser can determine at the purely lexical level that it is in a list or an
associative array on reading the first [ or { when it is in the context.
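
To illustrate the difference, here is a minimal Python sketch, not a real CIF
parser. The function names are hypothetical, and it assumes a table looks
like { "key":value, ... } with a quoted key, as in Herbert's summary quoted
below. With distinct delimiters a single character decides the container
type; with {} for both, the parser has to look inside the object.

```python
def container_kind_square_curly(text, pos):
    """[] for lists, {} for tables: the opening character alone decides."""
    ch = text[pos]
    if ch == "[":
        return "list"
    if ch == "{":
        return "table"
    return "scalar"


def container_kind_curly_only(text, pos):
    """{} for both lists and tables: look ahead for a quoted key and ':'."""
    if text[pos] != "{":
        return "scalar"
    i = pos + 1
    # Skip whitespace after the opening brace.
    while i < len(text) and text[i].isspace():
        i += 1
    # An empty container cannot be classified from its contents.
    if i >= len(text) or text[i] == "}":
        return "unknown"
    # A table entry starts with a quoted string followed by ':'.
    if text[i] in "'\"":
        end = text.find(text[i], i + 1)
        if end != -1:
            j = end + 1
            while j < len(text) and text[j].isspace():
                j += 1
            if j < len(text) and text[j] == ":":
                return "table"
    return "list"
```

Note that the second function cannot classify an empty {} at all, which is
one concrete cost of sharing the delimiter.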

My thinking for making both delimited by { came from the fact that there are
existing datanames with embedded [ and a CIF2 parser will take this to be
the beginning of a list. To simplify this parsing I suggested removing []
from the set of disallowed characters. Joe K quite correctly states that in
a CIF2 file there can be no [] in a dataname so it will be safe.
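
The dataname issue can be sketched in Python as follows. The two regular
expressions are simplifying assumptions, not the actual CIF1 or CIF2
character sets, and read_dataname is a hypothetical helper. The example name
is an existing mmCIF-style dataname with embedded brackets.

```python
import re

# Assumed simplification: a CIF1 name is '_' plus any run of non-blank chars.
CIF1_NAME = re.compile(r"_[^\s]+")
# Assumed simplification: a CIF2 name additionally excludes bracket characters.
CIF2_NAME = re.compile(r"_[^\s\[\]{}]+")


def read_dataname(text, cif2):
    """Return the dataname at the start of text under the chosen rule set."""
    pattern = CIF2_NAME if cif2 else CIF1_NAME
    match = pattern.match(text)
    return match.group(0) if match else None


line = "_atom_site.aniso_U[1][1] 0.05"
print(read_dataname(line, cif2=False))  # whole name, brackets included
print(read_dataname(line, cif2=True))   # name stops at the first '['
```

Under the CIF2-style rule the name ends at the first [, so whatever follows
would be read as the start of a list if [ opens lists.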

After this thread there was discussion of a leading comment identifying a
file as CIF2. IF THIS IS present the dilemma is removed: from the first line
of the parse we know whether to drop into the CIF1 or the CIF2 lexical
rules of our parser. BUT I am NOT sure whether we MANDATED this first-line
comment. An alternative is to (essentially) require a pre-parse to determine
whether the file is CIF1 or CIF2. Such a pre-parser cannot assume either
rule set, but must go through the first X lines, character by character,
until it can confidently conclude it is one or the other.
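
A minimal Python sketch of the first-line check follows. The marker strings
are an assumption: this message does not fix the form of the comment, so the
sketch uses a comment of the form #\#CIF_2.0 (and #\#CIF_1.1 for CIF1) purely
for illustration. detect_cif_version is a hypothetical name.

```python
def detect_cif_version(first_line):
    """Return 2 or 1 if the first line carries a version comment, else None.

    None means no marker was found, so the caller would have to fall back
    on a pre-parse of the file contents.
    """
    # Drop a possible byte-order mark and trailing whitespace/newline.
    stripped = first_line.lstrip("\ufeff").rstrip()
    if stripped.startswith("#\\#CIF_2"):
        return 2
    if stripped.startswith("#\\#CIF_1"):
        return 1
    return None
```

If the comment is not mandated, the None branch is the common case for
legacy files, and the pre-parse cannot be avoided.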

Either way these approaches remove the problem of CIF1 from the syntactic
specification of CIF2 (again something I would prefer to do).

We should vote on this since it will make the issue concrete. We can employ
square brackets to identify lists if we abstract away the issue of existing
CIF1 datanames to a higher level, which is a moot point anyway because there
are other aspects of CIF1 that break CIF2 parsers and that we need to deal
with.

Finally employing [] makes it much easier to cast everything into Python
(though this is just a convenience and not a critical reason for employing
them).
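
As a rough illustration of the Python convenience: if values were written
with comma-separated items and Python-compatible quoting (an assumption, not
something this thread has settled), [] lists and {} tables could be handed
directly to the standard library. cast_cif_value is a hypothetical name.

```python
import ast


def cast_cif_value(text):
    """Evaluate a bracketed value as a Python literal (no code execution)."""
    return ast.literal_eval(text)


print(cast_cif_value("[1, 2, [3.5, 'x']]"))      # a nested Python list
print(cast_cif_value("{'a': [1, 2], 'b': 3}"))   # a Python dict
print(type(cast_cif_value("{1, 2}")))            # a Python set, not a list
```

The last line shows the catch with {} lists: Python reads {1, 2} as a set,
which has no defined order, so such values could not be passed through
unchanged.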

And yes, tuples have been dropped from the CIF2 data types. Immutability of
a tuple is an implementation issue and not a representation issue. In terms
of representation it makes no difference to call a CIF object a tuple or a
list.


On 30/10/09 7:46 AM, "Joe Krahn" <[email protected]> wrote:

> I agree with James here. I don't see how brackets interfere with parsing
> at all. A list bracket can only be misinterpreted if it follows a data
> name with no intervening space, but that should be invalid anyhow,
> right? Switching to parenthesis is reasonable, but it should not be done
> just because brackets are part of the new STAR/CIF syntax.
> 
> As for the list examples below, why use commas instead of just quoting
> and whitespace delimiters, as in the current STAR syntax? If commas are
> used, commas would become a reserved character and need quoting or escaping.
> 
> I don't have strong opinions either way. My main interest is just to aim
> for a well-defined syntax without any parsing ambiguities.
> 
> Joe
> 
> James Hester wrote:
>> As the syntax that we have been developing now stands, the only reason
>> for not using square brackets is so that it will be possible to
>> correctly parse a CIF in which a space is accidentally missing between
>> a dataname and a bracketed list.  This seems to me to be a pretty
>> minor reason to fiddle with the bracket syntax, but having got that
>> off my chest I don't have any objections to the revised syntax.
>> 
>> NB I believe Nick would like to drop the concept of tuples in DDLm and
>> dREL altogether, with which I also agree.
>> 
>> On Thu, Oct 29, 2009 at 10:31 PM, Herbert J. Bernstein
>> <[email protected]> wrote:
>>> I have no objection to Nick's approach.  I would suggest a straw vote as
>>> soon as possible, so that we can move forward on coding.  So to be as
>>> specific as possible, here is what I think Nick is proposing:
>>> 
>>>   1.  All the bracketed constructs in a CIF be delimited by {}:
>>>        Lists:  { ..., ... }
>>>        Tuples: { ..., ... }
>>>        Arrays: { ..., ... }
>>>        Tables: { key:value, key:value }
>>>       with the distinctions among them made primarily by the type
>>>       specifications. Note that the key in a table should be a quoted
>>>       string.
>>> 
>>>   2.  That array dimensions in a CIF also be delimited by {} as in
>>> {3} or {3,4}
>>> 
>>>   3.  That the same changes be made in dREL
>>> 
>>> (Nick, did I get that right?)
>>> 
>>> I can work with all of the above, and I suspect Nick is right about the
>>> long-term value of consistency here, and reasonably strong typing does
>>> tend to reduce coding errors.  What do other people think?
>>> 
>>> Regards,
>>>   Herbert
>>> 
>>> =====================================================
>>>  Herbert J. Bernstein, Professor of Computer Science
>>>    Dowling College, Kramer Science Center, KSC 121
>>>         Idle Hour Blvd, Oakdale, NY, 11769
>>> 
>>>                  +1-631-244-3035
>>>                  [email protected]
>>> =====================================================
>>> 
>>> On Wed, 28 Oct 2009, Nick Spadaccini wrote:
>>> 
>>>> The move away from [] lists to {} lists (thus overlapping with {}
>>>> associative arrays) had to do with cleaning up the syntax under CIF-2.
>>>> 
>>>> There are legacy issues with existing CIF data names with embedded [] which
>>>> meant that using [ to initiate a list would come unstuck.
>>>> 
>>>> Accordingly to simplify matters and to move forward, I proposed using {} to
>>>> define lists or associative arrays. The complication to the parser is that
>>>> you must start to look inside the object to determine which it is.
>>>> 
>>>> That is CIF. dREL is a different matter but consistency is a good thing, so
>>>> that it makes sense to keep the syntax the same as a CIF data file. Hence
>>>> your transcription of the dREL code is correct. It makes my work a lot more
>>>> difficult of course because until now I just called up a Python parser to
>>>> handle almost all of dREL.
>>>> 
>>>> Such is life.
>>>> 
>>>> 
>>>> On 25/10/09 10:24 PM, "Herbert J. Bernstein" <[email protected]>
>>>> wrote:
>>>> 
>>>>> Dear Colleagues,
>>>>> 
>>>>>    Please take a look at the dictionaries I have drafted at
>>>>> 
>>>>>    http://vcif.sf.net/cif2_dicts
>>>>> 
>>>>> and tell me if I am on the right track in trying to convert to CIF-2
>>>>> format dictionaries.  I have taken all the August 2008 () style tuples in
>>>>> the upper level and converted them to October 2009 CIF-2 {} style lists.
>>>>> I have not changed any array dimension specifications, e.g. [*], nor the
>>>>> innards of any methods.
>>>>> 
>>>>>    Questions:
>>>>> 
>>>>>    1.  Should the dimensions be changed, e.g. from [3] to {3}?
>>>>>    2.  Should there be any changes in dREL methods themselves?
>>>>> 
>>>>> For example consider:
>>>>> 
>>>>> ======
>>>>> save_function.SymEquiv
>>>>>      _definition.id              'function.SymEquiv'
>>>>>      _definition.update           2007-10-11
>>>>>      _description.text
>>>>> ;
>>>>>       The function
>>>>>                       xyz' =  SymEquiv( symop, symcat, xyz )
>>>>> 
>>>>>       returns a fractional coordinate vector xyz' which is input vector
>>>>>       xyz transformed by the input symop 'n_pqr' applied to the symmetry
>>>>>       equivalent matrix extracted from the category symcat.
>>>>> ;
>>>>>      _name.category_id            function
>>>>>      _name.object_id              SymEquiv
>>>>>      _type.purpose                Assigned
>>>>>      _type.container              Array
>>>>>      _type.contents               Real
>>>>>      _type.dimension              [3]
>>>>>       loop_
>>>>>      _method.purpose
>>>>>      _method.expression
>>>>>       Evaluation
>>>>> ;
>>>>>       Function SymEquiv( c :[Single, Symop],    # symop string n_pqr
>>>>>                          l :[Category, Tag],    # loop of symmetry matrices
>>>>>                          x :[Array, Real]    )  # fract coordinate vector
>>>>>       {
>>>>>               s = l [ SymKey( c ) ]
>>>>> 
>>>>>               SymEquiv = s.R * x + s.T + SymLat( c )
>>>>>       }
>>>>> ;
>>>>>      save_
>>>>> ======
>>>>> 
>>>>> Should that remain the same, or should it be as follows
>>>>> 
>>>>> ======
>>>>> save_function.SymEquiv
>>>>>      _definition.id              'function.SymEquiv'
>>>>>      _definition.update           2007-10-11
>>>>>      _description.text
>>>>> ;
>>>>>       The function
>>>>>                       xyz' =  SymEquiv( symop, symcat, xyz )
>>>>> 
>>>>>       returns a fractional coordinate vector xyz' which is input vector
>>>>>       xyz transformed by the input symop 'n_pqr' applied to the symmetry
>>>>>       equivalent matrix extracted from the category symcat.
>>>>> ;
>>>>>      _name.category_id            function
>>>>>      _name.object_id              SymEquiv
>>>>>      _type.purpose                Assigned
>>>>>      _type.container              Array
>>>>>      _type.contents               Real
>>>>>      _type.dimension              {3}
>>>>>       loop_
>>>>>      _method.purpose
>>>>>      _method.expression
>>>>>       Evaluation
>>>>> ;
>>>>>       Function SymEquiv( c :{Single, Symop},    # symop string n_pqr
>>>>>                          l :{Category, Tag},    # loop of symmetry matrices
>>>>>                          x :{Array, Real}    )  # fract coordinate vector
>>>>>       {
>>>>>               s = l { SymKey( c ) }
>>>>> 
>>>>>               SymEquiv = s.R * x + s.T + SymLat( c )
>>>>>       }
>>>>> ;
>>>>>      save_
>>>>> ======
>>>>> 
>>>>> Regards,
>>>>>    Herbert
>>>>> 
>>>>> 
>>>>> =====================================================
>>>>>   Herbert J. Bernstein, Professor of Computer Science
>>>>>     Dowling College, Kramer Science Center, KSC 121
>>>>>          Idle Hour Blvd, Oakdale, NY, 11769
>>>>> 
>>>>>                   +1-631-244-3035
>>>>>                   [email protected]
>>>>> =====================================================
>>>>> 
>>>>> _______________________________________________
>>>>> ddlm-group mailing list
>>>>> [email protected]
>>>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>> cheers
>>>> 
>>>> Nick
>>>> 
>>>> --------------------------------
>>>> Associate Professor N. Spadaccini, PhD
>>>> School of Computer Science & Software Engineering
>>>> 
>>>> The University of Western Australia    t: +61 (0)8 6488 3452
>>>> 35 Stirling Highway                    f: +61 (0)8 6488 1089
>>>> CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
>>>> MBDP  M002
>>>> 
>>>> CRICOS Provider Code: 00126G
>>>> 
>>>> e: [email protected]
>>>> 

cheers

Nick

--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
MBDP  M002

CRICOS Provider Code: 00126G

e: [email protected]



