Discussion List Archives


Re: [ddlm-group] Space as a list item separator

James Hester wrote:
> It seems that most of us are in favour of spaces as separators, with 
> Herbert at least in favour of spaces and commas.  Some things to decide:
> 
> 1. Do we want spaces and commas, or only spaces?
> 2. Space syntax: while all primitive values should be separated from 
> neighbouring primitive values by spaces, what about compound values 
> (i.e. lists).  So for example, is
> 
> [[1 2 3][4 5 6]]
> 
> acceptable or should it be
> 
> [[1 2 3] [4 5 6]] ?
> 
> (I have added a space between the neighbouring lists in the second version).
I would exempt brackets from the whitespace rule, treating them as if 
they had implicit zero-width whitespace on both sides. Otherwise, a 
strict whitespace requirement would mean that this example has to be 
written as follows, with whitespace around all brackets:

[ [ 1 2 3 ] [ 4 5 6 ] ]
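
To illustrate the "implicit zero-width whitespace" idea, here is a minimal 
Python sketch, not a proposed grammar. It assumes unquoted values only (no 
quoted strings, no commas), and the names TOKEN and parse_list are 
hypothetical. Each bracket is a token in its own right, so all three 
spellings of the example parse to the same nested list:

```python
import re

# A bracket is always a token by itself; any other run of characters that
# are neither whitespace nor brackets is a value token.
TOKEN = re.compile(r"[\[\]]|[^\s\[\]]+")

def parse_list(text):
    """Parse a space-delimited, possibly nested list into nested Python lists.

    Assumption: values are unquoted, so quoting rules are ignored here.
    """
    tokens = TOKEN.findall(text)
    pos = 0

    def parse_item():
        nonlocal pos
        if pos >= len(tokens):
            raise ValueError("unexpected end of input")
        tok = tokens[pos]
        pos += 1
        if tok == "]":
            raise ValueError("unexpected ']'")
        if tok != "[":
            return tok  # a primitive value, kept as a string
        items = []
        while True:
            if pos >= len(tokens):
                raise ValueError("missing ']'")
            if tokens[pos] == "]":
                pos += 1
                return items
            items.append(parse_item())

    result = parse_item()
    if pos != len(tokens):
        raise ValueError("trailing input after the list")
    return result

# All three spellings give the same structure.
compact = parse_list("[[1 2 3][4 5 6]]")
spaced = parse_list("[[1 2 3] [4 5 6]]")
strict = parse_list("[ [ 1 2 3 ] [ 4 5 6 ] ]")
```

With this tokenizer, whitespace is only needed between two neighbouring 
primitive values; it is optional everywhere else.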

> 
> 3.  If commas are acceptable, we need to decide on the two cases that I 
> brought up recently: are multiple commas in a row acceptable (like 
> [1,,2,3])?  Are trailing commas acceptable - [1,2,3,]?  Herbert appears 
> to favour inferring a missing value in these cases, and Nick thinks they 
> should both be syntax errors.  I favour Nick's interpretation, and 
> Herbert's interpretations could then be coercion rules.  Of course, if 
> we drop commas altogether, this is a moot point.
I would interpret empty values the same as an empty string. For example, 
[1,,2,3] is the same as [1,"",2,3]. So, it may be an error depending on 
how "" is interpreted, but not an error at the parsing level.

Outside a list, quotes are required for an empty string because it 
otherwise has no delimiters; in a comma-delimited list, the commas 
themselves delimit it. That is not exactly true, however, if white space 
around a comma is allowed to act as part of the delimiter rather than as 
part of the item value. For example, is [1,,2,3] the same as 
[1, , 2, 3] ?
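
The two possible answers can be made concrete with a small Python sketch. 
It assumes a flat list of unquoted values, and split_comma_list and its 
trim flag are hypothetical names used only for illustration. With 
trim=True, white space next to a comma counts as part of the delimiter; 
with trim=False, it counts as part of the item value:

```python
def split_comma_list(text, trim=True):
    """Split a flat, unquoted list such as "[1,,2,3]" on commas.

    Empty items come back as empty strings instead of raising an error,
    so whether "" is acceptable is left to a later validation step.
    """
    if not (text.startswith("[") and text.endswith("]")):
        raise ValueError("expected a bracketed list")
    body = text[1:-1]
    if body.strip() == "":
        return []  # assumption: "[]" means an empty list, not [""]
    items = body.split(",")
    return [item.strip() for item in items] if trim else items

# White space treated as part of the delimiter: both forms agree.
a = split_comma_list("[1,,2,3]")
b = split_comma_list("[1, , 2, 3]")

# White space treated as part of the value: the forms differ.
c = split_comma_list("[1, , 2, 3]", trim=False)

# A trailing comma gives a trailing empty item under the same rule.
d = split_comma_list("[1,2,3,]")
```

Under the trimming rule the answer is yes; under the other rule the second 
item of [1, , 2, 3] is a single space, not an empty string.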

What about white space in the middle of a list item value? Normally, 
such an item would require quotes, but quotes would not necessarily be 
needed when commas are the delimiters. This is roughly the complementary 
case of a comma character in list versus non-list context.

My rationale for encouraging space-delimited lists is that it gives 
values within a list the same parsing syntax as any other series of 
values not in a list, and it has also been a proposed extension to STAR 
for some time. It just seems simpler because it is more uniform, but I 
don't know anything about implications for tags or dREL. Is it not 
possible to just transform a space-delimited CIF list into a 
comma-delimited tag with extra spaces removed?
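
As a rough sketch of such a transformation in Python: this assumes 
unquoted values that contain no commas, and to_comma_form is a 
hypothetical name. It does not check that brackets are balanced, and it 
says nothing about what tags or dREL would actually require:

```python
import re

# Brackets are single tokens; anything else without whitespace is a value.
TOKEN = re.compile(r"[\[\]]|[^\s\[\]]+")

def to_comma_form(text):
    """Rewrite a space-delimited list as a comma-delimited one, no spaces."""
    out = []
    prev = None
    for tok in TOKEN.findall(text):
        # An item (a value or an opening bracket) needs a comma in front of
        # it whenever it follows another item (a value or a closing bracket).
        if tok != "]" and prev is not None and prev != "[":
            out.append(",")
        out.append(tok)
        prev = tok
    return "".join(out)

nested = to_comma_form("[[1 2 3] [4 5 6]]")
compact = to_comma_form("[[1 2 3][4 5 6]]")
flat = to_comma_form("[ a   b c ]")
```

Quoted values, or values that themselves contain commas, would need the 
extra quoting rules discussed above.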

Obviously, parsing comma-delimited lists is not that complicated. It 
just means that extra care is needed in defining the whitespace and 
quotation rules, which are necessarily different from those for 
space-delimited values.

Joe Krahn
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
