Re: [ddlm-group] Space as a list item separator
- To: Group finalising DDLm and associated dictionaries <[email protected]>
- Subject: Re: [ddlm-group] Space as a list item separator
- From: "Herbert J. Bernstein" <[email protected]>
- Date: Mon, 30 Nov 2009 08:36:24 -0500 (EST)
- In-Reply-To: <[email protected]>
- References: <C7398588.126B6%[email protected]><[email protected]><[email protected]><[email protected]>
Sorry, something got lost in the prior message. It should have
read:
> Dear Colleagues,
>
> Back to the question of commas. If you accept the desirability of
> having a CIF 1.5, commas in lists become very useful. Someone with
> a CIF 1.1 editor will be able to prepare a CIF 1.5 file for many
> useful cases by doing all lists with commas and no embedded blanks
> as long as they can make their lists fit on single lines. In CIF 1.1
>
> [[1,2,3],[4,5,6],[7,8,9]]
>
> is a valid value for a tag, but
>
> [[1 2 3] [4 5 6] [7 8 9]]
>
> is not.
>
> Having the option of commas in lists will help to smooth the
> transition for at least some people.
>
> Regards,
> Herbert
=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769
+1-631-244-3035
[email protected]
=====================================================
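
A rough illustration of the point about CIF 1.1 editors, assuming only that
such a tool separates values on whitespace. Plain str.split() is a stand-in
chosen for this sketch, not a real CIF 1.1 tokenizer, and the function name
is hypothetical:

```python
def whitespace_tokens(value_text):
    """Split text the way a purely whitespace-driven tool would (assumption)."""
    return value_text.split()


comma_form = "[[1,2,3],[4,5,6],[7,8,9]]"
blank_form = "[[1 2 3] [4 5 6] [7 8 9]]"

# The comma form contains no blanks, so it survives as a single value.
print(len(whitespace_tokens(comma_form)))
# The blank-separated form is broken up into several pieces.
print(len(whitespace_tokens(blank_form)))
```

Under that assumption the comma form stays one token while the blank form
falls apart, which is why commas help on single-line lists.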
On Mon, 30 Nov 2009, Herbert J. Bernstein wrote:
> Dear Colleagues,
>
> Back to the question of commas. If you accept the desirability of
> having a CIF 1.5, commas in lists become very useful. Someone with
> a CIF 1.1 editor will be able to prepare a CIF 1.5 file for many
> useful cases by doing all lists with commas and no embedded blanks
> as long as they can make their lists fit on single lines. In CIF 1.1
>
> [[1,2,3],[4,5,6],[7,8,9]]
>
> is a valid value for a tag, but
>
> [[1 2 3] [4 5 6] [7 8 9]]
>
> Having the option of commas in lists will help to smooth the
> transition for at least some people.
>
> Regards,
> Herbert
>
> On Mon, 30 Nov 2009, SIMON WESTRIP wrote:
>
>> OK - one last bash at explaining my recent description of base CIF syntax
>> and the motivation behind it.
>>
>> I'll start with the motivation:
>>
>> (1) reduce the restriction on the character set of non-delimited strings.
>>
>> In CIF2 the character set of nondelimited strings has been restricted to
>> disallow e.g. ' in O1' because this can lead to ambiguity in e.g. lists.
>> However, lists require a separator (say it's a comma for the sake of
>> argument), so O1' can be included in a list, e.g. [O1',O1',O1'].
>> The important point here is that we require separators
>> between tokens, so these separators have significance in parsing and
>> effectively terminate a nondelimited string.
>> Obviously, the separators cannot be part of a nondelimited string, which is
>> why I specified a single separator in my recent description.
>>
>> (2) reduce the base syntax as far as possible to something that is readily
>> parsable by both machine and human, and can be seen as set-in-stone so that
>> we don't have the same problems when going from CIF2 to CIF3 that we have in
>> going from CIF1 to CIF2.
>>
>> I think we are all agreed that one of the aims in defining CIF2 is to
>> define something that will be the base for all future CIF versions, so
>> there's nothing new here.
>>
>> My particular approach to the description is irrelevant in many respects,
>> as CIF2 will be defined unambiguously.
>> However, it was an attempt to reconcile current CIF1 with CIF2 - e.g. using
>> the concept of separators rather than delimiters that effectively include
>> the separator in their definition, and describing everything in terms of
>> delimited and nondelimited strings.
>>
>> Actually, I will not elaborate on my description further as the main point
>> in this message is given in (1) above. I've been trying to find examples
>> that break my assertion that a delimiter can be contained in a nondelimited
>> string as long as it's not the first character - perhaps someone can put me
>> out of my misery?
>>
>> Cheers
>>
>> Simon
>>
>>
>>
>>
>> ____________________________________________________________________________
>> From: SIMON WESTRIP <[email protected]>
>> To: [email protected]; Group finalising DDLm and associated
>> dictionaries <[email protected]>
>> Sent: Monday, 30 November, 2009 9:32:07
>> Subject: Re: [ddlm-group] Space as a list item separator
>>
>> Yes I agree - my wording "dropping the CIF1 syntax of requiring space after
>> data values" was simply careless here.
>>
>>
>>
>> ____________________________________________________________________________
>> From: Nick Spadaccini <[email protected]>
>> To: Group finalising DDLm and associated dictionaries <[email protected]>
>> Sent: Monday, 30 November, 2009 6:22:00
>> Subject: Re: [ddlm-group] Space as a list item separator
>>
>> James has already elaborated on this but for the record we have dropped
>> that "a delimiter character and one whitespace" is MANDATED to be the token
>> delimiter.
>> We still require a space as a token separator, it is not elevated to being
>> part of the delimiter. If not there as a separator we have ways of
>> recovering with coercion rules. Clearly a whitespace is necessary to
>> separate non-delimited strings because they have no delimiting character.
>>
>> This more consistent approach led to grammar rules that were the same
>> whether tokens were inside the new compound data types or not.
>>
>> The previous discussions on this list elaborate on these points.
>>
>>
>> On 28/11/09 6:01 PM, "SIMON WESTRIP" <[email protected]> wrote:
>>
>> I had been under the assumption that the separation of list
>> items by a comma was 'set in stone'
>> (and was one reason for dropping the CIF1 syntax of requiring
>> space after data values),
>> but if it's up for negotiation I would opt for using the space as
>> a separator as elsewhere in the CIF,
>> partly because then a list can essentially be treated much like
>> a single-item loop - i.e. same basic parsing
>> of <value><space><value><space>...
>>
>> Cheers
>>
>> Simon
>>
>> ____________________________________________________________________________
>> From: Herbert J. Bernstein <[email protected]>
>> To: Group finalising DDLm and associated dictionaries
>> <[email protected]>
>> Cc: [email protected]
>> Sent: Friday, 27 November, 2009 11:43:10
>> Subject: Re: [ddlm-group] Space as a list item separator
>>
>> Dear Colleagues,
>>
>> I have no objection to accepting either comma or whitespace
>> as a valid separator in a list. I can't object -- I have been
>> coding to that standard since 1997, and now would only have to
>> remove the message generated for the case of the space. We already
>> accept multiple glyphs as valid separators at all levels:
>>
>> whitespace itself is one of several character sequences in rather
>> complex combinations: any number of blanks, tabs, newlines and
>> comments.
>> The comma itself is handled in a complex way. We accept (or
>> should accept) any whitespace before and after a comma as valid, as in
>> {a,b} versus {a , b }. Adding the option of leaving out the comma
>> itself and just having the whitespace as the separator makes just
>> as much sense.
>>
>> I see nothing to be gained by now forbidding the comma. The
>> meaning of {a,,b,} is the same as {a,.,b,.} or {a,?,b,?} or,
>> under this new (and I think more sensible and realistic
>> approach) {a . b .} or {a ? b ?}.
>>
>> The blank reads particularly well in dealing with vectors and
>> matrices. The comma reads well when dealing with strings.
>>
>> I think we would do best with both as valid alternatives (no
>> error, no warning for either one).
>>
>> Regards,
>> Herbert
>>
>> On Fri, 27 Nov 2009, SIMON WESTRIP wrote:
>>
>> > At first glance, you're considering using space instead of commas as list
>> > separators?
>> > which is not so far away from the CIF1 requirement of space following a
>> > delimiter?
>> >
>> > But I'm only on my first cup of coffee this morning :-)
>> >
>> > ____________________________________________________________________________
>> > From: Nick Spadaccini <[email protected]>
>> > To: Group finalising DDLm and associated dictionaries <[email protected]>
>> > Sent: Friday, 27 November, 2009 7:46:44
>> > Subject: Re: [ddlm-group] Space as a list item separator
>> >
>> >
>> >
>> >
>> > On 27/11/09 2:32 PM, "James Hester" <[email protected]> wrote:
>> >
>> > > See comments below:
>> > >
>> > > On Fri, Nov 27, 2009 at 3:09 PM, Nick Spadaccini <[email protected]> wrote:
>> > >> Timely email, come in just after the one I sent.
>> > >>
>> > >> My position is if we specify the syntax then we encourage its correct use
>> > >> but acknowledge that there may be cases where one might be able to recover
>> > >> intent. But I wouldn't encourage those cases.
>> > >
>> > > Absolutely, which is why I would like to elevate space-separated list items
>> > > to be correct syntax rather than 'wrong but intent is clear' syntax.
>> > >>
>> > >> You could say that token separator in lists are a or b or c, but that just
>> > >> adds a level of complexity for very little gain. The choice of comma makes it
>> > >> seamless to translate from the raw CIF data straight in to most language
>> > >> specific data declaration. The only language I know that accepts one or the
>> > >> other or both is MatLab.
>> > >
>> > > Re ease of translation: you speak as if a viable approach to a CIF data file
>> > > is to take whole text chunks and throw them at some language interpreter,
>> > > without doing your own parse. Quite apart from being a rather unlikely
>> > > approach, this is impossible, as without parsing you won't know where the list
>> > > finishes. If you do do your own parse, you can populate your datastructures
>> > > directly during the parse, and what list separator was originally used in the
>> > > data file is completely irrelevant.
>> > >
>> > > Re complexity: not sure how you are planning to deal with whitespace in the
>> > > formal grammar, but consider the following, where I have assumed that each
>> > > token 'eats up' the following whitespace.
>> > >
>> > > <dataitem> = <dataname><whitespace>+<datavalue>
>> > > <datavalue> = {<list>|<string>}<whitespace>+
>> > > <listdatavalue> = {<list>|<string>}<whitespace>*
>> > > <list> = '[' <whitespace>* {<listdatavalue> {<comma><whitespace>*<listdatavalue>}*}* ']'
>> > >
>> > > If we make comma or whitespace possible separators, the last production
>> > > becomes:
>> > > <list> = '[' <whitespace>* {<listdatavalue> {<comma or whitespace><listdatavalue>}*}* ']'
>> > >
>> > > This looks like no extra complexity, and from a user's point of view
>> > > whitespace as an alternative separator is simple to understand and consistent
>> > > with space as a token separator used everywhere else in CIF. Anyway, if
>> > > reduction of grammar complexity is your goal, you can just completely exclude
>> > > commas as list separators!
>> >
>> > Why not? Make them spaces only, and you become consistent across the board.
>> > I have to think about the possibility of pathological cases where spaces
>> > won't work. I can't think of any at the moment.
>> >
>> > >
>> > > Some questions about how commas behave:
>> > > 1: is a trailing comma e.g. [1,2,3,4,] a syntax error?
>> > > 2. are two commas in a row a syntax error? E.g. [1,2,3,,4]
>> >
>> > I would say yes to syntax error. I can easily determine there may need to be
>> > an additional list value, but can't determine what.
>> >
>> > > Note the above productions assume that the answer to both is yes.
>> > >
>> > >>
>> > >> What big advantage to a language is there to specify you can use a comma or
>> > >> whitespace as a token separator? Will you be happy with the first person who
>> > >> interprets this as being ok
>> > >>
>> > >> loop_
>> > >> _severalvalues 1,2,3,4,5,6,7 # these being the 7 values of severalvalues
>> > >>
>> > > Not sure what you are getting at here: I am proposing the following:
>> > >
>> > > _nicelist [1 2 3 4 5 6 7]
>> > >
>> > > being the same as
>> > >
>> > > _nicelist [1,2,3,4,5,6,7]
>> > >
>> > > Don't see how this relates to loops.
>> >
>> > The point was, once you say a space and comma are equivalent token
>> > separators then will it be an interpretation that they are always so even in
>> > loops? My example was not a list, just 7 values that were separated by
>> > commas not spaces.
>> >
>> > >
>> > > James.
>> > > ------
>> > >>
>> > >> On 27/11/09 11:41 AM, "James Hester" <[email protected]> wrote:
>> > >>
>> > >>> Dear All: looking over the list I posted previously of items left to
>> > >>> resolve, I see only one serious one outstanding: whether or not to allow
>> > >>> space as a separator between list items. Nick has stated:
>> > >>>
>> > >>> " I will propose it has to be a comma, but make the coercion rule that space
>> > >>> separated values in a list-type object be coerced into comma separated
>> > >>> values. That is, read spaces as you want, but don't encourage them."
>> > >>>
>> > >>> I would like to counter-propose, as Joe did originally, that whitespace be
>> > >>> elevated to equal status with comma as a valid list separator. I see no
>> > >>> downside to this. Would anyone else like to speak to this issue before we
>> > >>> vote? In particular, I would be interested to hear why Nick doesn't want to
>> > >>> encourage spaces.
>> > >>
>> > >> cheers
>> > >>
>> > >> Nick
>> > >>
>> > >> --------------------------------
>> > >> Associate Professor N. Spadaccini, PhD
>> > >> School of Computer Science & Software Engineering
>> > >>
>> > >> The University of Western Australia t: +61 (0)8 6488 3452
>> > >> 35 Stirling Highway f: +61 (0)8 6488 1089
>> > >> CRAWLEY, Perth, WA 6009 AUSTRALIA w3: www.csse.uwa.edu.au/~nick
>> > >> MBDP M002
>> > >>
>> > >> CRICOS Provider Code: 00126G
>> > >>
>> > >> e: [email protected]
>> > >>
>> > >>
>> > >>
>> > >> _______________________________________________
>> > >> ddlm-group mailing list
>> > >> [email protected]
>> > >> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>> > >>
>> > >
>> > >
>> >
>> > cheers
>> >
>> > Nick
>>
>> cheers
>>
>> Nick
>
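
The list productions James Hester quotes above can be sketched as a small
recursive-descent parser. This is only an illustration under stated
assumptions, not anyone's actual CIF2 implementation: the names TOKEN and
parse_list are hypothetical, values are kept as plain strings, quoted strings
are ignored, and (as in the quoted productions) a trailing comma or two commas
in a row are treated as syntax errors.

```python
import re

# Assumed token classes: whitespace runs, brackets, commas, bare strings.
# A bare string may contain any character except whitespace, brackets, commas.
TOKEN = re.compile(r"\s+|[\[\],]|[^\s\[\],]+")


def parse_list(text):
    """Parse '[...]' whose items are separated by a comma or by whitespace."""
    tokens = TOKEN.findall(text.strip())
    pos = 0

    def skip_ws():
        # Consume whitespace tokens; report whether any were seen.
        nonlocal pos
        saw = False
        while pos < len(tokens) and tokens[pos].isspace():
            pos += 1
            saw = True
        return saw

    def parse():
        nonlocal pos
        if pos >= len(tokens) or tokens[pos] != "[":
            raise SyntaxError("expected '['")
        pos += 1
        items = []
        skip_ws()
        after_comma = False  # True when a comma is still waiting for its value
        while True:
            if pos >= len(tokens):
                raise SyntaxError("unterminated list")
            tok = tokens[pos]
            if tok == "]":
                if after_comma:
                    raise SyntaxError("trailing comma")
                pos += 1
                return items
            if tok == ",":
                raise SyntaxError("comma without a preceding value")
            if tok == "[":
                items.append(parse())  # nested list
            else:
                items.append(tok)
                pos += 1
            after_comma = False
            had_ws = skip_ws()
            if pos < len(tokens) and tokens[pos] == ",":
                pos += 1
                skip_ws()
                after_comma = True
            elif pos < len(tokens) and tokens[pos] != "]" and not had_ws:
                raise SyntaxError("missing separator between items")

    result = parse()
    skip_ws()
    if pos != len(tokens):
        raise SyntaxError("unexpected text after list")
    return result


print(parse_list("[1 2 3 4 5 6 7]") == parse_list("[1,2,3,4,5,6,7]"))
print(parse_list("[[1 2 3] [4 5 6] [7 8 9]]"))
print(parse_list("[O1',O1',O1']"))
```

Because the separator ends each bare token, a value such as O1' passes through
unchanged, which is the point Simon makes in (1); whether CIF2 should permit
that is exactly what the thread is debating.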

