
Re: [ddlm-group] Space as a list item separator




On 27/11/09 2:32 PM, "James Hester" <jamesrhester@gmail.com> wrote:

> See comments below:
> 
> On Fri, Nov 27, 2009 at 3:09 PM, Nick Spadaccini <nick@csse.uwa.edu.au> wrote:
>> Timely email; it came in just after the one I sent.
>> 
>> My position is if we specify the syntax then we encourage its correct use but
>> acknowledge that there may be cases where one might be able to recover
>> intent. But I wouldn't encourage those cases.
> 
> Absolutely, which is why I would like to elevate space-separated list items to
> be correct syntax rather than 'wrong but intent is clear' syntax.
>> 
>> You could say that token separators in lists are a or b or c, but that just
>> adds a level of complexity for very little gain. The choice of comma makes it
>> seamless to translate from the raw CIF data straight into most
>> language-specific data declarations. The only language I know that accepts
>> one or the other or both is MATLAB.
> 
> Re ease of translation: you speak as if a viable approach to a CIF data file
> is to take whole text chunks and throw them at some language interpreter,
> without doing your own parse.  Quite apart from being a rather unlikely
> approach, this is impossible, as without parsing you won't know where the list
> finishes.  If you do do your own parse, you can populate your datastructures
> directly during the parse, and what list separator was originally used in the
> data file is completely irrelevant.
> 
> Re complexity: not sure how you are planning to deal with whitespace in the
> formal grammar, but consider the following, where I have assumed that each
> token 'eats up' the following whitespace.
> 
> <dataitem> = <dataname><whitespace>+<datavalue>
> <datavalue> = {<list>|<string>}<whitespace>+
> <listdatavalue> = {<list>|<string>}<whitespace>*
> <list> = '[' <whitespace>* {<listdatavalue> {<comma><whitespace>*<listdatavalue>}*}* ']'
> 
> If we make comma or whitespace possible separators, the last production
> becomes:
> <list> = '[' <whitespace>* {<listdatavalue> {<comma or whitespace><listdatavalue>}*}* ']'
> 
> This looks like no extra complexity, and from a user's point of view
> whitespace as an alternative separator is simple to understand and consistent
> with space as a token separator used everywhere else in CIF.  Anyway, if
> reduction of grammar complexity is your goal, you can just completely exclude
> commas as list separators!

Why not? Make them spaces only, and you become consistent across the board.
I have to think about the possibility of pathological cases where spaces
won't work. I can't think of any at the moment.

> 
> Some questions about how commas behave:
> 1: is a trailing comma e.g. [1,2,3,4,] a syntax error?
> 2. are two commas in a row a syntax error? E.g. [1,2,3,,4]

I would say yes, both are syntax errors. I can easily determine that there
may need to be an additional list value, but can't determine what it is.
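
A minimal sketch of that behaviour in Python, assuming the answers above (a
trailing comma and two commas in a row are both syntax errors) and assuming
commas and whitespace are both accepted as separators. The names tokenize,
parse_list and parse_cif_list are hypothetical, quoted strings are not
handled, and values are kept as plain strings:

```python
import re

# Hypothetical token pattern: a bracket, a comma, or a bare value.
# Quoted strings are deliberately left out of this sketch.
TOKEN = re.compile(r"\s*([\[\],]|[^\s\[\],]+)")


def tokenize(text):
    """Split text into brackets, commas and bare values, discarding whitespace."""
    text = text.rstrip()
    tokens, pos = [], 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        tokens.append(match.group(1))
        pos = match.end()
    return tokens


def parse_list(tokens, i=0):
    """Parse a list starting at tokens[i]; return (values, next_index)."""
    if i >= len(tokens) or tokens[i] != "[":
        raise SyntaxError("expected '['")
    i += 1
    values = []
    need_value = False  # True immediately after a comma
    while i < len(tokens):
        tok = tokens[i]
        if tok == "]":
            if need_value:
                raise SyntaxError("trailing comma before ']'")
            return values, i + 1
        if tok == ",":
            if need_value or not values:
                raise SyntaxError("comma without a preceding value")
            need_value = True
            i += 1
        elif tok == "[":
            sub, i = parse_list(tokens, i)  # nested list
            values.append(sub)
            need_value = False
        else:
            # Two values with no comma between them were separated by
            # whitespace (this sketch does not check that it was present).
            values.append(tok)
            need_value = False
            i += 1
    raise SyntaxError("unterminated list")


def parse_cif_list(text):
    """Parse a complete list value and reject anything left over."""
    tokens = tokenize(text)
    values, end = parse_list(tokens)
    if end != len(tokens):
        raise SyntaxError("unexpected text after list")
    return values
```

With this sketch, parse_cif_list("[1 2 3 4 5 6 7]") and
parse_cif_list("[1,2,3,4,5,6,7]") return the same seven values, while
"[1,2,3,4,]" and "[1,2,3,,4]" both raise SyntaxError.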
 
> Note the above productions assume that the answer to both is yes.
> 
>> 
>> What big advantage to a language is there to specify you can use a comma or
>> whitespace as a token separator? Will you be happy with the first person who
>> interprets this as being ok
>> 
>> loop_
>>   _severalvalues 1,2,3,4,5,6,7 # these being the 7 values of severalvalues
>> 
> Not sure what you are getting at here: I am proposing the following:
> 
> _nicelist      [1 2 3 4 5 6 7]
> 
> being the same as
> 
> _nicelist      [1,2,3,4,5,6,7]
> 
>  Don't see how this relates to loops.

The point was: once you say that space and comma are equivalent token
separators, will people interpret them as always equivalent, even in loops?
My example was not a list, just 7 values that were separated by commas
rather than spaces.
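
To make the concern concrete, here is a rough Python sketch under the
assumption that whitespace stays the only separator outside of lists (that
is the point under discussion, not something settled here). The function
loop_values is hypothetical, and its comment handling is deliberately
simplistic:

```python
def loop_values(line):
    """Split one line of loop data on whitespace, ignoring a '#' comment."""
    if line.lstrip().startswith("#"):
        return []  # the whole line is a comment
    data = line.split(" #", 1)[0]  # drop a trailing comment, if any
    return data.split()


# Commas are not separators here, so this line holds a single value.
print(loop_values("1,2,3,4,5,6,7 # these being the 7 values of severalvalues"))
# Whitespace-separated values are split as usual.
print(loop_values("1 2 3 4 5 6 7"))
```

Under that assumption the comma-separated line is read as one value, not
seven, which is exactly the misreading a reader might fall into.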

> 
> James.
> ------
>> 
>> On 27/11/09 11:41 AM, "James Hester" <jamesrhester@gmail.com> wrote:
>> 
>>> Dear All: looking over the list I posted previously of items left to
>>> resolve, I see only one serious one outstanding: whether or not to allow
>>> space as a separator between list items.  Nick has stated:
>>> 
>>> " I will propose it has to be a comma, but make the coercion rule that space
>>> separated values in a list-type object be coerced into comma separated
>>> values. That is, read spaces as you want, but don't encourage them."
>>> 
>>> I would like to counter-propose, as Joe did originally, that whitespace be
>>> elevated to equal status with comma as a valid list separator.  I see no
>>> downside to this.  Would anyone else like to speak to this issue before we
>>> vote?  In particular, I would be interested to hear why Nick doesn't want to
>>> encourage spaces.
>> 
>> cheers
>> 
>> Nick
>> 
>> --------------------------------
>> Associate Professor N. Spadaccini, PhD
>> School of Computer Science & Software Engineering
>> 
>> The University of Western Australia    t: +61 (0)8 6488 3452
>> 35 Stirling Highway                    f: +61 (0)8 6488 1089
>> CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
>> MBDP  M002
>> 
>> CRICOS Provider Code: 00126G
>> 
>> e: Nick.Spadaccini@uwa.edu.au
>> 
>> 
>> 
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>> 
> 
> 

cheers

Nick


