From: Nick Spadaccini <nick@csse.uwa.edu.au>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Friday, 27 November, 2009 7:46:44
Subject: Re: [ddlm-group] Space as a list item separator
On 27/11/09 2:32 PM, "James Hester" <jamesrhester@gmail.com> wrote:
> See comments below:
>
> On Fri, Nov 27, 2009 at 3:09 PM, Nick Spadaccini <nick@csse.uwa.edu.au> wrote:
>> Timely email, coming in just after the one I sent.
>>
>> My position is if we specify the syntax then we encourage its correct use but
>> acknowledge that there may be cases where one might be able to recover
>> intent. But I wouldn't encourage those cases.
>
> Absolutely, which is why I would like to elevate space-separated list items to
> be correct syntax rather than 'wrong but intent is clear' syntax.
>>
>> You could say that token separators in lists are a or b or c, but that just
>> adds a level of complexity for very little gain. The choice of comma makes it
>> seamless to translate from the raw CIF data straight into most
>> language-specific data declarations. The only language I know that accepts
>> one or the other or both is MATLAB.
>
> Re ease of translation: you speak as if a viable approach to a CIF data file
> is to take whole text chunks and throw them at some language interpreter,
> without doing your own parse. Quite apart from being a rather unlikely
> approach, this is impossible, as without parsing you won't know where the list
> finishes. If you do your own parse, you can populate your data structures
> directly during the parse, and what list separator was originally used in the
> data file is completely irrelevant.
>
> Re complexity: not sure how you are planning to deal with whitespace in the
> formal grammar, but consider the following, where I have assumed that each
> token 'eats up' the following whitespace.
>
> <dataitem> = <dataname><whitespace>+<datavalue>
> <datavalue> = {<list>|<string>}<whitespace>+
> <listdatavalue> = {<list>|<string>}<whitespace>*
> <list> = '[' <whitespace>* {<listdatavalue> {<comma><whitespace>*<listdatavalue>}*}* ']'
>
> If we make comma or whitespace possible separators, the last production
> becomes:
> <list> = '[' <whitespace>* {<listdatavalue> {<comma or whitespace><listdatavalue>}*}* ']'
>
> This looks like no extra complexity, and from a user's point of view
> whitespace as an alternative separator is simple to understand and consistent
> with space as a token separator used everywhere else in CIF. Anyway, if
> reduction of grammar complexity is your goal, you can just completely exclude
> commas as list separators!
Why not? Make them spaces only, and you become consistent across the board.
I have to think about the possibility of pathological cases where spaces
won't work. I can't think of any at the moment.
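
As a rough illustration of the production James gives above, here is a minimal Python sketch of a list reader that accepts a comma, whitespace, or both between items. It is not the DDLm grammar: the name parse_list is hypothetical, values are treated as bare unquoted tokens, and it assumes that a comma with no value on one side of it is an error.

```python
import re

# Hypothetical tokenizer: a token is a bracket, a comma, or a run of
# characters containing no whitespace, brackets, or commas.
_TOKEN = re.compile(r"\s*([\[\],]|[^\s\[\],]+)")


def parse_list(text):
    """Parse '[...]' into nested Python lists of strings.

    Items may be separated by a comma, by whitespace, or by both.
    Quoted strings are deliberately not handled in this sketch.
    """
    tokens = _TOKEN.findall(text)
    pos = 0

    def parse_items():
        nonlocal pos
        items = []
        after_comma = False  # True when the previous token was a comma
        while pos < len(tokens):
            tok = tokens[pos]
            pos += 1
            if tok == "]":
                if after_comma:
                    raise ValueError("comma with no value before ']'")
                return items
            if tok == ",":
                if after_comma or not items:
                    raise ValueError("comma with no value before it")
                after_comma = True
                continue
            # Either a nested list or a bare value.
            items.append(parse_items() if tok == "[" else tok)
            after_comma = False
        raise ValueError("missing closing ']'")

    if not tokens or tokens[0] != "[":
        raise ValueError("list must start with '['")
    pos = 1
    result = parse_items()
    if pos != len(tokens):
        raise ValueError("unexpected text after the list")
    return result
```

Under these assumptions, parse_list("[1 2 3 4 5 6 7]") and parse_list("[1,2,3,4,5,6,7]") return the same seven values, which is the point that the original separator is irrelevant once the parse has been done.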
>
> Some questions about how commas behave:
> 1. is a trailing comma e.g. [1,2,3,4,] a syntax error?
> 2. are two commas in a row a syntax error? E.g. [1,2,3,,4]
I would say yes to syntax error. I can easily determine that there may need to
be an additional list value, but can't determine what.
> Note the above productions assume that the answer to both is yes.
>
>>
>> What big advantage to a language is there to specify you can use a comma or
>> whitespace as a token separator? Will you be happy with the first person who
>> interprets this as being ok
>>
>> loop_
>> _severalvalues 1,2,3,4,5,6,7 # these being the 7 values of severalvalues
>>
> Not sure what you are getting at here: I am proposing the following:
>
> _nicelist [1 2 3 4 5 6 7]
>
> being the same as
>
> _nicelist [1,2,3,4,5,6,7]
>
> Don't see how this relates to loops.
The point was: once you say that space and comma are equivalent token
separators, will people interpret them as always equivalent, even in
loops? My example was not a list, just 7 values that were separated by
commas rather than spaces.
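
To make the loop concern concrete, here is a small Python sketch, assuming that values outside a list are split on whitespace only and that a token starting with '#' begins a comment. The function name split_loop_values is hypothetical, and quoted strings are ignored.

```python
def split_loop_values(line):
    """Split one line of loop values on whitespace, dropping a trailing comment."""
    values = []
    for tok in line.split():
        if tok.startswith("#"):
            break  # the rest of the line is a comment
        values.append(tok)
    return values


# Comma-separated text outside a list stays a single value.
print(len(split_loop_values("1,2,3,4,5,6,7 # these being the 7 values")))
# Space-separated text gives seven values.
print(len(split_loop_values("1 2 3 4 5 6 7")))
```

Under this assumption the comma-separated line is read as one value, not seven, so anyone who took comma and space to be interchangeable everywhere would get a different result from the one they intended.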
>
> James.
> ------
>>
>> On 27/11/09 11:41 AM, "James Hester" <jamesrhester@gmail.com> wrote:
>>
>>> Dear All: looking over the list I posted previously of items left to
>>> resolve, I see only one serious one outstanding: whether or not to allow
>>> space as a separator between list items. Nick has stated:
>>>
>>> " I will propose it has to be a comma, but make the coercion rule that space
>>> separated values in a list-type object be coerced into comma separated
>>> values. That is, read spaces as you want, but don't encourage them."
>>>
>>> I would like to counter-propose, as Joe did originally, that whitespace be
>>> elevated to equal status with comma as a valid list separator. I see no
>>> downside to this. Would anyone else like to speak to this issue before we
>>> vote? In particular, I would be interested to hear why Nick doesn't want to
>>> encourage spaces.
>>
>> cheers
>>
>> Nick
>>
>> --------------------------------
>> Associate Professor N. Spadaccini, PhD
>> School of Computer Science & Software Engineering
>>
>> The University of Western Australia t: +61 (0)8 6488 3452
>> 35 Stirling Highway f: +61 (0)8 6488 1089
>> CRAWLEY, Perth, WA 6009 AUSTRALIA w3: www.csse.uwa.edu.au/~nick
>> MBDP M002
>>
>> CRICOS Provider Code: 00126G
>>
>> e: Nick.Spadaccini@uwa.edu.au
>>
>>
>>
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>
>
>
cheers
Nick
--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering
The University of Western Australia t: +61 (0)8 6488 3452
35 Stirling Highway f: +61 (0)8 6488 1089
CRAWLEY, Perth, WA 6009 AUSTRALIA w3: www.csse.uwa.edu.au/~nick
MBDP M002
CRICOS Provider Code: 00126G
e: Nick.Spadaccini@uwa.edu.au
Re: [ddlm-group] Space as a list item separator
- To: Nick.Spadaccini@uwa.edu.au, Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Space as a list item separator
- From: SIMON WESTRIP <simonwestrip@btinternet.com>
- Date: Fri, 27 Nov 2009 09:20:25 +0000 (GMT)
- In-Reply-To: <C735A4E4.12669%nick@csse.uwa.edu.au>
- References: <C735A4E4.12669%nick@csse.uwa.edu.au>
At first glance, you're considering using spaces instead of commas as list separators?
That is not so far away from the CIF1 requirement of a space following a delimiter.
But I'm only on my first cup of coffee this morning :-)