Re: [ddlm-group] CIF-2 changes
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] CIF-2 changes
- From: Nick Spadaccini <nick@csse.uwa.edu.au>
- Date: Mon, 09 Nov 2009 09:55:04 +0800
- In-Reply-To: <4AEA2972.4070501@niehs.nih.gov>
James and Joe are correct on this point. The dropping of [] was done to ease the handling of older CIF1 files. BUT it absolutely introduces problems of its own, while trying to ease other parts of the parsing process. I don't know if my thinking was mature enough on this issue when I suggested the change.

Let me make my position clear. I WOULD MUCH PREFER to have lists defined by square brackets and associative arrays by curly brackets. In this way the parser can determine at the purely lexical level whether it is in a list or an associative array on reading the first [ or { when it is in the right context.

My thinking for making both delimited by { came from the fact that there are existing datanames with embedded [, and a CIF2 parser will take this to be the beginning of a list. To simplify this parsing I suggested removing [] from the set of disallowed characters. Joe K quite correctly states that in a CIF2 file there can be no [] in a dataname, so it will be safe.

After this thread there was discussion on a leading comment identifying a file as CIF2. IF THIS IS present the dilemma is removed. At the first line of the parse we know whether to drop in to the CIF1 or the CIF2 lexical rules of our parser. BUT I am NOT sure if we MANDATED this first line comment. An alternative is to (essentially) require a pre-parse to determine whether the file is CIF1 or CIF2. Such a pre-parser cannot assume either rule set, but must go through the first X lines, character by character, until it can confidently conclude it is one or the other. Either way these approaches remove the problem of CIF1 from the syntactic specification of CIF2 (again something I would prefer to do).

We should vote on this since it will make the issue concrete. We can employ square brackets to identify lists if we abstract away the issue of existing CIF1 datanames to a higher level. Which is a moot point anyway, because there are other aspects of CIF1 that break CIF2 parsers that we need to deal with.
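As a rough illustration of the two ideas above (choosing the rule set from a leading comment, then telling a list from an associative array on its opening character alone), a minimal Python sketch might look like the following. The marker text `#\#CIF_2.0` and the names `detect_dialect` and `classify_value_start` are assumptions made for this example; nothing in this thread fixes them, and the CIF1 branch is deliberately left as a placeholder.

```python
# Assumed form of the leading comment; the thread does not settle its text.
CIF2_MARKER = "#\\#CIF_2.0"


def detect_dialect(text):
    """Pick the lexical rule set from the first line only.

    Returns "CIF2" when the first line starts with the assumed marker,
    otherwise falls back to "CIF1". No pre-parse of later lines is done.
    """
    first_line = text.split("\n", 1)[0].rstrip("\r")
    return "CIF2" if first_line.startswith(CIF2_MARKER) else "CIF1"


def classify_value_start(ch, dialect):
    """Classify a data value from its opening character.

    Under the preferred scheme, [ opens a list and { opens an associative
    array, so one character of context is enough. Anything else (and
    everything under the CIF1 rules in this sketch) is reported as "other".
    """
    if dialect == "CIF2":
        if ch == "[":
            return "list"
        if ch == "{":
            return "table"
    return "other"
```

With {} used for both constructs, `classify_value_start` could not decide on the first character and would have to look inside the object, which is the complication the square-bracket scheme avoids.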

Finally, employing [] makes it much easier to cast everything into Python (though this is just a convenience and not a critical reason for employing them).

And yes, tuples have been dropped from the CIF2 data types. Immutability of a tuple is an implementation issue and not a representation issue. In terms of representation it makes no difference to call a CIF object a tuple or a list.

On 30/10/09 7:46 AM, "Joe Krahn" <krahn@niehs.nih.gov> wrote:

> I agree with James here. I don't see how brackets interfere with parsing
> at all. A list bracket can only be misinterpreted if it follows a data
> name with no intervening space, but that should be invalid anyhow,
> right? Switching to parentheses is reasonable, but it should not be done
> just because brackets are part of the new STAR/CIF syntax.
>
> As for the list examples below, why use commas instead of just quoting
> and whitespace delimiters, as in the current STAR syntax? If commas are
> used, commas would become a reserved character and need quoting or escaping.
>
> I don't have strong opinions either way. My main interest is just to aim
> for a well-defined syntax without any parsing ambiguities.
>
> Joe
>
> James Hester wrote:
>> As the syntax that we have been developing now stands, the only reason
>> for not using square brackets is so that it will be possible to
>> correctly parse a CIF in which a space is accidentally missing between
>> a dataname and a bracketed list. This seems to me to be a pretty
>> minor reason to fiddle with the bracket syntax, but having got that
>> off my chest I don't have any objections to the revised syntax.
>>
>> NB I believe Nick would like to drop the concept of tuples in DDLm and
>> dREL altogether, with which I also agree.
>>
>> On Thu, Oct 29, 2009 at 10:31 PM, Herbert J. Bernstein
>> <yaya@bernstein-plus-sons.com> wrote:
>>> I have no objection to Nick's approach. I would suggest a straw vote as
>>> soon as possible, so that we can move forward on coding.
>>> So to be as specific as possible, here is what I think Nick is proposing:
>>>
>>> 1. All the bracketed constructs in a CIF be delimited by {}:
>>>      Lists:  { ..., ... }
>>>      Tuples: { ..., ... }
>>>      Arrays: { ..., ... }
>>>      Tables: { key:value, key:value }
>>>    with the distinctions among them made primarily by the type
>>>    specifications. Note that the key in a table should be a quoted string.
>>>
>>> 2. That array dimensions in a CIF also be delimited by {} as in
>>>    {3} or {3,4}
>>>
>>> 3. That the same changes be made in dREL
>>>
>>> (Nick, did I get that right?)
>>>
>>> I can work with all of the above, and I suspect Nick is right about the
>>> long-term value of consistency here, and reasonably strong typing does
>>> tend to reduce coding errors. What do other people think?
>>>
>>> Regards,
>>> Herbert
>>>
>>> =====================================================
>>> Herbert J. Bernstein, Professor of Computer Science
>>> Dowling College, Kramer Science Center, KSC 121
>>> Idle Hour Blvd, Oakdale, NY, 11769
>>>
>>> +1-631-244-3035
>>> yaya@dowling.edu
>>> =====================================================
>>>
>>> On Wed, 28 Oct 2009, Nick Spadaccini wrote:
>>>
>>>> The move away from [] lists to {} lists (thus overlapping with {}
>>>> associative arrays) had to do with cleaning up the syntax under CIF-2.
>>>>
>>>> There are legacy issues with existing CIF data names with embedded [] which
>>>> meant that using [ to initiate a list would come unstuck.
>>>>
>>>> Accordingly to simplify matters and to move forward, I proposed using {} to
>>>> define lists or associative arrays. The complication to the parser is that
>>>> you must start to look inside the object to determine which it is.
>>>>
>>>> That is CIF. dREL is a different matter but consistency is a good thing, so
>>>> that it makes sense to keep the syntax the same as a CIF data file. Hence
>>>> your transcription of the dREL code is correct.
>>>> It makes my work a lot more difficult of course because until now I
>>>> just called up a Python parser to handle almost all of dREL.
>>>>
>>>> Such is life.
>>>>
>>>>
>>>> On 25/10/09 10:24 PM, "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
>>>> wrote:
>>>>
>>>>> Dear Colleagues,
>>>>>
>>>>> Please take a look at the dictionaries I have drafted at
>>>>>
>>>>> http://vcif.sf.net/cif2_dicts
>>>>>
>>>>> and tell me if I am on the right track in trying to convert to CIF-2
>>>>> format dictionaries. I have taken all the August 2008 () style tuples in
>>>>> the upper level and converted them to October 2009 CIF-2 {} style lists.
>>>>> I have not changed any array dimension specifications, e.g. [*], nor the
>>>>> innards of any methods.
>>>>>
>>>>> Questions:
>>>>>
>>>>> 1. Should the dimensions be changed, e.g. from [3] to {3}?
>>>>> 2. Should there be any changes in dREL methods themselves?
>>>>>
>>>>> For example consider:
>>>>>
>>>>> ======
>>>>> save_function.SymEquiv
>>>>> _definition.id      'function.SymEquiv'
>>>>> _definition.update  2007-10-11
>>>>> _description.text
>>>>> ;
>>>>> The function
>>>>> xyz' = SymEquiv( symop, symcat, xyz )
>>>>>
>>>>> returns a fractional coordinate vector xyz' which is input vector
>>>>> xyz transformed by the input symop 'n_pqr' applied to the symmetry
>>>>> equivalent matrix extracted from the category symcat.
>>>>> ;
>>>>> _name.category_id  function
>>>>> _name.object_id    SymEquiv
>>>>> _type.purpose      Assigned
>>>>> _type.container    Array
>>>>> _type.contents     Real
>>>>> _type.dimension    [3]
>>>>> loop_
>>>>> _method.purpose
>>>>> _method.expression
>>>>> Evaluation
>>>>> ;
>>>>> Function SymEquiv( c :[Single, Symop],   # symop string n_pqr
>>>>>                    l :[Category, Tag],   # loop of symmetry matrices
>>>>>                    x :[Array, Real] )    # fract coordinate vector
>>>>> {
>>>>>     s = l [ SymKey( c ) ]
>>>>>
>>>>>     SymEquiv = s.R * x + s.T + SymLat( c )
>>>>> }
>>>>> ;
>>>>> save_
>>>>> ======
>>>>>
>>>>> Should that remain the same, or should it be as follows
>>>>>
>>>>> ======
>>>>> save_function.SymEquiv
>>>>> _definition.id      'function.SymEquiv'
>>>>> _definition.update  2007-10-11
>>>>> _description.text
>>>>> ;
>>>>> The function
>>>>> xyz' = SymEquiv( symop, symcat, xyz )
>>>>>
>>>>> returns a fractional coordinate vector xyz' which is input vector
>>>>> xyz transformed by the input symop 'n_pqr' applied to the symmetry
>>>>> equivalent matrix extracted from the category symcat.
>>>>> ;
>>>>> _name.category_id  function
>>>>> _name.object_id    SymEquiv
>>>>> _type.purpose      Assigned
>>>>> _type.container    Array
>>>>> _type.contents     Real
>>>>> _type.dimension    {3}
>>>>> loop_
>>>>> _method.purpose
>>>>> _method.expression
>>>>> Evaluation
>>>>> ;
>>>>> Function SymEquiv( c :{Single, Symop},   # symop string n_pqr
>>>>>                    l :{Category, Tag},   # loop of symmetry matrices
>>>>>                    x :{Array, Real} )    # fract coordinate vector
>>>>> {
>>>>>     s = l { SymKey( c ) }
>>>>>
>>>>>     SymEquiv = s.R * x + s.T + SymLat( c )
>>>>> }
>>>>> ;
>>>>> save_
>>>>> ======
>>>>>
>>>>> Regards,
>>>>> Herbert
>>>>>
>>>>>
>>>>> =====================================================
>>>>> Herbert J. Bernstein, Professor of Computer Science
>>>>> Dowling College, Kramer Science Center, KSC 121
>>>>> Idle Hour Blvd, Oakdale, NY, 11769
>>>>>
>>>>> +1-631-244-3035
>>>>> yaya@dowling.edu
>>>>> =====================================================
>>>>>
>>>>> _______________________________________________
>>>>> ddlm-group mailing list
>>>>> ddlm-group@iucr.org
>>>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>>
>>>> cheers
>>>>
>>>> Nick
>>>>
>>>> --------------------------------
>>>> Associate Professor N. Spadaccini, PhD
>>>> School of Computer Science & Software Engineering
>>>>
>>>> The University of Western Australia    t: +61 (0)8 6488 3452
>>>> 35 Stirling Highway                    f: +61 (0)8 6488 1089
>>>> CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
>>>> MBDP M002
>>>>
>>>> CRICOS Provider Code: 00126G
>>>>
>>>> e: Nick.Spadaccini@uwa.edu.au

cheers

Nick

--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
MBDP M002

CRICOS Provider Code: 00126G

e: Nick.Spadaccini@uwa.edu.au