--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
Re: [ddlm-group] Space as a list item separator
- To: Nick.Spadaccini@uwa.edu.au, Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Space as a list item separator
- From: James Hester <jamesrhester@gmail.com>
- Date: Fri, 27 Nov 2009 17:32:53 +1100
- In-Reply-To: <C7357208.1265F%nick@csse.uwa.edu.au>
- References: <279aad2a0911261941o6f76b12aq156fef9f7eed2376@mail.gmail.com><C7357208.1265F%nick@csse.uwa.edu.au>
On Fri, Nov 27, 2009 at 3:09 PM, Nick Spadaccini <nick@csse.uwa.edu.au> wrote:
Timely email, coming in just after the one I sent.
My position is if we specify the syntax then we encourage its correct use but acknowledge that there may be cases where one might be able to recover intent. But I wouldn’t encourage those cases.
Absolutely, which is why I would like to elevate space-separated list items to be correct syntax rather than 'wrong but intent is clear' syntax.
You could say that token separators in lists are a or b or c, but that just adds a level of complexity for very little gain. The choice of comma makes it seamless to translate from the raw CIF data straight into most language-specific data declarations. The only language I know that accepts one or the other or both is MATLAB.
Re ease of translation: you speak as if a viable approach to a CIF data file is to take whole text chunks and throw them at some language interpreter, without doing your own parse. Quite apart from being a rather unlikely approach, this is impossible, as without parsing you won't know where the list finishes. If you do your own parse, you can populate your data structures directly during the parse, and which list separator was originally used in the data file is completely irrelevant.
Re complexity: not sure how you are planning to deal with whitespace in the formal grammar, but consider the following, where I have assumed that each token 'eats up' the following whitespace.
<dataitem> = <dataname><whitespace>+<datavalue>
<datavalue> = {<list>|<string>}<whitespace>+
<listdatavalue> = {<list>|<string>}<whitespace>*
<list> = '[' <whitespace>* {<listdatavalue> {<comma><whitespace>*<listdatavalue>}*}* ']'
If we make comma or whitespace possible separators, the last production becomes:
<list> = '[' <whitespace>* {<listdatavalue> {<comma or whitespace><listdatavalue>}*}* ']'
This looks like no extra complexity, and from a user's point of view whitespace as an alternative separator is simple to understand and consistent with space as a token separator used everywhere else in CIF. Anyway, if reduction of grammar complexity is your goal, you can just completely exclude commas as list separators!
Some questions about how commas behave:
1. Is a trailing comma a syntax error? E.g. [1,2,3,4,]
2. Are two commas in a row a syntax error? E.g. [1,2,3,,4]
Note the above productions assume that the answer to both is yes.
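To make the productions above concrete, here is a minimal Python sketch of a hand-written parser for the list production. It is an illustration under stated assumptions, not part of the proposal: the names `parse_list`, `_parse_value` and the `allow_space` flag are hypothetical, values are kept as plain strings, and quoted strings are not handled. It treats both a trailing comma and a doubled comma as syntax errors, matching the "yes" answers assumed above, and it returns the position just past the closing bracket, since the end of the list is only known once it has been parsed.

```python
WS = " \t\r\n"  # characters treated as whitespace in this sketch


def _skip_ws(text, pos):
    """Return the index of the first non-whitespace character at or after pos."""
    while pos < len(text) and text[pos] in WS:
        pos += 1
    return pos


def _peek(text, pos):
    """Return the character at pos, or '' if pos is past the end of the text."""
    return text[pos] if pos < len(text) else ""


def _parse_value(text, pos, allow_space):
    """Parse a nested list or a bare (unquoted) string starting at pos."""
    if _peek(text, pos) == "[":
        return parse_list(text, pos, allow_space)
    start = pos
    while pos < len(text) and text[pos] not in WS + ",[]":
        pos += 1
    if pos == start:
        # Reached for a trailing comma, a doubled comma, or premature end of input.
        raise ValueError(f"expected a value at position {start}")
    return text[start:pos], pos


def parse_list(text, pos=0, allow_space=True):
    """Parse a bracketed list at text[pos]; return (items, index after ']')."""
    if _peek(text, pos) != "[":
        raise ValueError(f"expected '[' at position {pos}")
    pos = _skip_ws(text, pos + 1)
    items = []
    if _peek(text, pos) == "]":  # empty list
        return items, pos + 1
    while True:
        value, pos = _parse_value(text, pos, allow_space)
        items.append(value)
        after = _skip_ws(text, pos)
        if _peek(text, after) == "]":
            return items, after + 1
        if _peek(text, after) == ",":
            pos = _skip_ws(text, after + 1)   # comma separator
        elif allow_space and after > pos:
            pos = after                       # whitespace-only separator
        else:
            raise ValueError(f"expected ',' or ']' at position {after}")
```

With `allow_space=True`, `parse_list("[1 2 3]")` and `parse_list("[1,2,3]")` both yield the same items, while `parse_list("[1,2,3,4,]")` and `parse_list("[1,2,3,,4]")` raise `ValueError`. Passing `allow_space=False` gives the comma-only behaviour, and the difference between the two variants is a single branch.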
What big advantage is there to a language in specifying that either a comma or whitespace can be used as a token separator? Will you be happy with the first person who interprets this as being OK:
loop_
_severalvalues 1,2,3,4,5,6,7 # these being the 7 values of severalvalues
Not sure what you are getting at here: I am proposing the following:
_nicelist [1 2 3 4 5 6 7]
being the same as
_nicelist [1,2,3,4,5,6,7]
Don't see how this relates to loops.
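As a rough illustration of the equivalence proposed here, the following Python sketch splits a flat bracketed list on either separator. The function name `split_flat_list` is hypothetical, and the sketch assumes no nesting and no quoted strings containing spaces, commas or brackets.

```python
import re


def split_flat_list(text):
    """Split a flat bracketed list on commas and/or whitespace (sketch only)."""
    body = text.strip()
    if not (body.startswith("[") and body.endswith("]")):
        raise ValueError("not a bracketed list")
    inner = body[1:-1].strip()
    if not inner:
        return []
    # A separator is a comma with optional surrounding whitespace,
    # or a run of whitespace with no comma in it.
    items = re.split(r"\s*,\s*|\s+", inner)
    if "" in items:
        # An empty item means a leading, trailing, or doubled comma.
        raise ValueError("empty list item (stray comma?)")
    return items


# Both spellings of _nicelist produce the same seven items.
assert split_flat_list("[1 2 3 4 5 6 7]") == split_flat_list("[1,2,3,4,5,6,7]")
```

Only the contents of the brackets are examined, so nothing here touches how loop values outside a list are separated.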
James.
------
Dear All: looking over the list I posted previously of items left to resolve, I see only one serious one outstanding: whether or not to allow space as a separator between list items. Nick has stated:
"I will propose it has to be a comma, but make the coercion rule that space separated values in a list-type object be coerced into comma separated values. That is, read spaces as you want, but don't encourage them."
I would like to counter-propose, as Joe did originally, that whitespace be elevated to equal status with comma as a valid list separator. I see no downside to this. Would anyone else like to speak to this issue before we vote? In particular, I would be interested to hear why Nick doesn't want to encourage spaces.
cheers
Nick
--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering
The University of Western Australia
35 Stirling Highway
CRAWLEY, Perth, WA 6009 AUSTRALIA
MBDP M002
CRICOS Provider Code: 00126G
t: +61 (0)8 6488 3452
f: +61 (0)8 6488 1089
w3: www.csse.uwa.edu.au/~nick
e: Nick.Spadaccini@uwa.edu.au
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group