Re: [ddlm-group] CIF-2 changes

James and Joe are correct on this point. Square brackets were dropped to
make life easier for older CIF1 files. BUT that change absolutely introduces
problems of its own, even as it eases other parts of the parsing process. I
don't know if my thinking was mature enough on this issue when I suggested
the change.

Let me make my position clear. I WOULD MUCH PREFER to have lists delimited by
square brackets and associative arrays by curly brackets. That way the
parser can determine at the purely lexical level whether it is in a list or
an associative array as soon as it reads the first [ or { in a context where
a data value is expected.
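
To make the contrast concrete, here is a rough Python sketch (illustration
only, not a proposed implementation). The function names are hypothetical,
and the quoted key followed by a colon is assumed from Herbert's summary of
the table syntax. The scan ignores nesting, escapes and the other details a
real CIF2 lexer would need.

```python
def kind_with_distinct_brackets(text: str) -> str:
    """Classify a value from its first character alone ([] lists, {} tables)."""
    first = text.lstrip()[:1]
    if first == "[":
        return "list"
    if first == "{":
        return "table"
    return "scalar"


def kind_with_shared_braces(text: str) -> str:
    """Classify a value when {} opens both lists and tables.

    The opening brace is no longer enough: we have to look inside the
    object and check whether the first item is a quoted key followed by
    a colon (assumed table syntax: { "key":value ... }).
    """
    body = text.strip()
    if not body.startswith("{"):
        return "scalar"
    inner = body[1:].lstrip()
    if inner[:1] in ("'", '"'):
        quote = inner[0]
        end = inner.find(quote, 1)  # position of the closing quote, or -1
        if end != -1 and inner[end + 1:end + 2] == ":":
            return "table"
    return "list"
```

With distinct brackets the decision is made on one character. With shared
braces the parser has to read past the first quoted item before it knows
which kind of object it is in.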

My thinking for making both delimited by {} came from the fact that there are
existing datanames with an embedded [, and a CIF2 parser would take this to
be the beginning of a list. To simplify this parsing I suggested removing []
from the set of disallowed characters. Joe K quite correctly states that in
a CIF2 file there can be no [] in a dataname, so using [ to open a list will
be safe.

After this thread there was discussion of a leading comment identifying a
file as CIF2. IF THIS IS present the dilemma is removed: at the first line
of the parse we know whether to drop into the CIF1 or the CIF2 lexical
rules of our parser. BUT I am NOT sure if we MANDATED this first-line
comment. An alternative is to (essentially) require a pre-parse to determine
whether the file is CIF1 or CIF2. Such a pre-parser cannot assume either
rule set, but must go through the first X lines, character by character,
until it can confidently conclude it is one or the other.
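
As a sketch of the first-line check (Python, illustration only): the exact
text of the identifying comment is an assumption here. I have taken it to be
of the form #\#CIF_2.0, by analogy with the #\#CIF_1.1 line used in CIF 1.1
files. The function names are hypothetical.

```python
import re

# Assumed form of the leading comment: "#\#CIF_" followed by major.minor.
MAGIC = re.compile(r"^#\\#CIF_(\d+)\.(\d+)")


def sniff_cif_version(first_line: str):
    """Return (major, minor) from a leading magic comment, or None if absent."""
    match = MAGIC.match(first_line)
    if match is None:
        return None
    return int(match.group(1)), int(match.group(2))


def choose_rules(first_line: str) -> str:
    """Pick the lexical rule set to use, based only on the first line."""
    version = sniff_cif_version(first_line)
    if version is None:
        # No identifying comment: fall back to the pre-parse described above.
        return "pre-parse"
    return "CIF2" if version[0] >= 2 else "CIF1"
```

If the comment is mandated, the "pre-parse" branch is never needed for a
conforming CIF2 file and the choice of rule set costs one line of input.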

Either way these approaches remove the problem of CIF1 from the syntactic
specification of CIF2 (again something I would prefer to do).

We should vote on this since it will make the issue concrete. We can employ
square brackets to identify lists if we abstract away the issue of existing
CIF1 datanames to a higher level. That issue is a moot point anyway, because
there are other aspects of CIF1 that break CIF2 parsers and that we need to
deal with.

Finally, employing [] makes it much easier to cast everything into Python
(though this is just a convenience and not a critical reason for employing
them).

And yes, tuples have been dropped from the CIF2 data types. Immutability of
a tuple is an implementation issue and not a representation issue. In terms
of representation it makes no difference to call a CIF object a tuple or a
list.


On 30/10/09 7:46 AM, "Joe Krahn" <krahn@niehs.nih.gov> wrote:

> I agree with James here. I don't see how brackets interfere with parsing
> at all. A list bracket can only be misinterpreted if it follows a data
> name with no intervening space, but that should be invalid anyhow,
> right? Switching to parentheses is reasonable, but it should not be done
> just because brackets are part of the new STAR/CIF syntax.
> 
> As for the list examples below, why use commas instead of just quoting
> and whitespace delimiters, as in the current STAR syntax? If commas are
> used, commas would become a reserved character and need quoting or escaping.
> 
> I don't have strong opinions either way. My main interest is just to aim
> for a well-defined syntax without any parsing ambiguities.
> 
> Joe
> 
> James Hester wrote:
>> As the syntax that we have been developing now stands, the only reason
>> for not using square brackets is so that it will be possible to
>> correctly parse a CIF in which a space is accidentally missing between
>> a dataname and a bracketed list.  This seems to me to be a pretty
>> minor reason to fiddle with the bracket syntax, but having got that
>> off my chest I don't have any objections to the revised syntax.
>> 
>> NB I believe Nick would like to drop the concept of tuples in DDLm and
>> dREL altogether, with which I also agree.
>> 
>> On Thu, Oct 29, 2009 at 10:31 PM, Herbert J. Bernstein
>> <yaya@bernstein-plus-sons.com> wrote:
>>> I have no objection to Nick's approach.  I would suggest a straw vote as
>>> soon as possible, so that we can move forward on coding.  So to be as
>>> specific as possible, here is what I think Nick is proposing:
>>> 
>>>   1.  All the bracketed constructs in a CIF be delimited by {}:
>>>        Lists:  { ..., ... }
>>>        Tuples: { ..., ... }
>>>        Arrays: { ..., ... }
>>>        Tables: { key:value, key:value }
>>>       with the distinctions among them made primarily by the type
>>>       specifications. Note that the key in a table should be a quoted
>>>       string.
>>> 
>>>   2.  That array dimensions in a CIF also be delimited by {} as in
>>> {3} or {3,4}
>>> 
>>>   3.  That the same changes be made in dREL
>>> 
>>> (Nick, did I get that right?)
>>> 
>>> I can work with all of the above, and I suspect Nick is right about the
>>> long-term value of consistency here, and reasonably strong typing does
>>> tend to reduce coding errors.  What do other people think?
>>> 
>>> Regards,
>>>   Herbert
>>> 
>>> =====================================================
>>>  Herbert J. Bernstein, Professor of Computer Science
>>>    Dowling College, Kramer Science Center, KSC 121
>>>         Idle Hour Blvd, Oakdale, NY, 11769
>>> 
>>>                  +1-631-244-3035
>>>                  yaya@dowling.edu
>>> =====================================================
>>> 
>>> On Wed, 28 Oct 2009, Nick Spadaccini wrote:
>>> 
>>>> The move away from [] lists to {} lists (thus overlapping with {}
>>>> associative arrays) had to do with cleaning up the syntax under CIF-2.
>>>> 
>>>> There are legacy issues with existing CIF data names with embedded [] which
>>>> meant that using [ to initiate a list would come unstuck.
>>>> 
>>>> Accordingly to simplify matters and to move forward, I proposed using {} to
>>>> define lists or associative arrays. The complication to the parser is that
>>>> you must start to look inside the object to determine which it is.
>>>> 
>>>> That is CIF. dREL is a different matter but consistency is a good thing, so
>>>> that it makes sense to keep the syntax the same as a CIF data file. Hence
>>>> your transcription of the dREL code is correct. It makes my work a lot more
>>>> difficult of course because until now I just called up a Python parser to
>>>> handle almost all of dREL.
>>>> 
>>>> Such is life.
>>>> 
>>>> 
>>>> On 25/10/09 10:24 PM, "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
>>>> wrote:
>>>> 
>>>>> Dear Colleagues,
>>>>> 
>>>>>    Please take a look at the dictionaries I have drafted at
>>>>> 
>>>>>    http://vcif.sf.net/cif2_dicts
>>>>> 
>>>>> and tell me if I am on the right track in trying to convert to CIF-2
>>>>> format dictionaries.  I have taken all the August 2008 () style tuples in
>>>>> the upper level and converted them to October 2009 CIF-2 {} style lists.
>>>>> I have not changed any array dimension specifications, e.g. [*], nor the
>>>>> innards of any methods.
>>>>> 
>>>>>    Questions:
>>>>> 
>>>>>    1.  Should the dimensions be changed, e.g. from [3] to {3}?
>>>>>    2.  Should there be any changes in dREL methods themselves?
>>>>> 
>>>>> For example consider:
>>>>> 
>>>>> ======
>>>>> save_function.SymEquiv
>>>>>      _definition.id              'function.SymEquiv'
>>>>>      _definition.update           2007-10-11
>>>>>      _description.text
>>>>> ;
>>>>>       The function
>>>>>                       xyz' =  SymEquiv( symop, symcat, xyz )
>>>>> 
>>>>>       returns a fractional coordinate vector xyz' which is input vector
>>>>>       xyz transformed by the input symop 'n_pqr' applied to the symmetry
>>>>>       equivalent matrix extracted from the category symcat.
>>>>> ;
>>>>>      _name.category_id            function
>>>>>      _name.object_id              SymEquiv
>>>>>      _type.purpose                Assigned
>>>>>      _type.container              Array
>>>>>      _type.contents               Real
>>>>>      _type.dimension              [3]
>>>>>       loop_
>>>>>      _method.purpose
>>>>>      _method.expression
>>>>>       Evaluation
>>>>> ;
>>>>>       Function SymEquiv( c :[Single, Symop],    # symop string n_pqr
>>>>>                          l :[Category, Tag],    # loop of symmetry matrices
>>>>>                          x :[Array, Real]    )  # fract coordinate vector
>>>>>       {
>>>>>               s = l [ SymKey( c ) ]
>>>>> 
>>>>>               SymEquiv = s.R * x + s.T + SymLat( c )
>>>>>       }
>>>>> ;
>>>>>      save_
>>>>> ======
>>>>> 
>>>>> Should that remain the same, or should it be as follows
>>>>> 
>>>>> ======
>>>>> save_function.SymEquiv
>>>>>      _definition.id              'function.SymEquiv'
>>>>>      _definition.update           2007-10-11
>>>>>      _description.text
>>>>> ;
>>>>>       The function
>>>>>                       xyz' =  SymEquiv( symop, symcat, xyz )
>>>>> 
>>>>>       returns a fractional coordinate vector xyz' which is input vector
>>>>>       xyz transformed by the input symop 'n_pqr' applied to the symmetry
>>>>>       equivalent matrix extracted from the category symcat.
>>>>> ;
>>>>>      _name.category_id            function
>>>>>      _name.object_id              SymEquiv
>>>>>      _type.purpose                Assigned
>>>>>      _type.container              Array
>>>>>      _type.contents               Real
>>>>>      _type.dimension              {3}
>>>>>       loop_
>>>>>      _method.purpose
>>>>>      _method.expression
>>>>>       Evaluation
>>>>> ;
>>>>>       Function SymEquiv( c :{Single, Symop},    # symop string n_pqr
>>>>>                          l :{Category, Tag},    # loop of symmetry matrices
>>>>>                          x :{Array, Real}    )  # fract coordinate vector
>>>>>       {
>>>>>               s = l { SymKey( c ) }
>>>>> 
>>>>>               SymEquiv = s.R * x + s.T + SymLat( c )
>>>>>       }
>>>>> ;
>>>>>      save_
>>>>> ======
>>>>> 
>>>>> Regards,
>>>>>    Herbert
>>>>> 
>>>>> 
>>>>> =====================================================
>>>>>   Herbert J. Bernstein, Professor of Computer Science
>>>>>     Dowling College, Kramer Science Center, KSC 121
>>>>>          Idle Hour Blvd, Oakdale, NY, 11769
>>>>> 
>>>>>                   +1-631-244-3035
>>>>>                   yaya@dowling.edu
>>>>> =====================================================
>>>>> 
>>>>> _______________________________________________
>>>>> ddlm-group mailing list
>>>>> ddlm-group@iucr.org
>>>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>> cheers
>>>> 
>>>> Nick
>>>> 
>>>> --------------------------------
>>>> Associate Professor N. Spadaccini, PhD
>>>> School of Computer Science & Software Engineering
>>>> 
>>>> The University of Western Australia    t: +61 (0)8 6488 3452
>>>> 35 Stirling Highway                    f: +61 (0)8 6488 1089
>>>> CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
>>>> MBDP  M002
>>>> 
>>>> CRICOS Provider Code: 00126G
>>>> 
>>>> e: Nick.Spadaccini@uwa.edu.au

cheers

Nick

--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
MBDP  M002

CRICOS Provider Code: 00126G

e: Nick.Spadaccini@uwa.edu.au



