
Re: [ddlm-group] Space as a list item separator

Sorry something got lost in the prior message.  It should have
read:

> Dear Colleagues,
>
>  Back to the question of commas.  If you accept the desirability of
> having a CIF 1.5, commas in lists become very useful.  Someone with
> a CIF 1.1 editor will be able to prepare a CIF 1.5 file for many
> useful cases by doing all lists with commas and no embedded blanks
> as long as they can make their lists fit on single lines.  In CIF 1.1
>
> [[1,2,3],[4,5,6],[7,8,9]]
>
> is a valid value for a tag, but
>
> [[1 2 3] [4 5 6] [7 8 9]]
>
> is not.
>
> Having the option of commas in lists will help to smooth the
> transition for at least some people.
>
> Regards,
>  Herbert

=====================================================
  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769

                  +1-631-244-3035
                  yaya@dowling.edu
=====================================================
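
A minimal sketch of why the comma form survives a whitespace-based tokenizer, assuming a simplified splitter that breaks a line on blanks and tabs only. It is not a full CIF 1.1 lexer, and the name `whitespace_tokens` is hypothetical. The comma-separated matrix stays a single token, while the blank-separated one breaks into several pieces:

```python
def whitespace_tokens(line: str) -> list[str]:
    """Split a line on runs of whitespace (simplified; not a real CIF lexer)."""
    return line.split()


comma_form = "[[1,2,3],[4,5,6],[7,8,9]]"
space_form = "[[1 2 3] [4 5 6] [7 8 9]]"

# The comma form contains no blanks, so it is seen as one value.
print(len(whitespace_tokens(comma_form)))  # 1

# The blank-separated form is cut at every blank.
print(len(whitespace_tokens(space_form)))  # 9
```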

On Mon, 30 Nov 2009, Herbert J. Bernstein wrote:

> Dear Colleagues,
>
>  Back to the question of commas.  If you accept the desirability of
> having a CIF 1.5, commas in lists become very useful.  Someone with
> a CIF 1.1 editor will be able to prepare a CIF 1.5 file for many
> useful cases by doing all lists with commas and no embedded blanks
> as long as they can make their lists fit on single lines.  In CIF 1.1
>
> [[1,2,3],[4,5,6],[7,8,9]]
>
> is a valid value for a tag, but
>
> [[1 2 3] [4 5 6] [7 8 9]]
>
> Having the option of commas in lists will help to smooth the
> transition for at least some people.
>
> Regards,
>  Herbert
>
>
> On Mon, 30 Nov 2009, SIMON WESTRIP wrote:
>
>> OK - one last bash at explaining my recent description of base CIF syntax
>> and the motivation behind it.
>> 
>> I'll start with the motivation:
>> 
>> (1) reduce the restriction on the character set of non-delimited strings.
>> 
>> In CIF2 the character set of nondelimited strings has been restricted to
>> disallow e.g. ' in O1' because this can lead to ambiguity in e.g. lists.
>> However, lists require a separator (say it's a comma for the sake of
>> argument), so O1' can be included in a list, e.g. [O1',O1',O1'].
>> The important point here is that we require separators
>> between tokens, so these separators have significance in parsing and
>> effectively terminate a nondelimited string.
>> Obviously, the separators cannot be part of a nondelimited string, which is
>> why I specified a single separator in my recent description.
>> 
>> (2) reduce the base syntax as far as possible to something that is readily
>> parsable by both machine and human, and can be seen as set-in-stone so that
>> we don't have the same problems when going from CIF2 to CIF3 that we have in
>> going from CIF1 to CIF2.
>> 
>> I think we are all agreed that one of the aims in defining CIF2 is to define
>> something that will be the base for all future CIF versions, so there's
>> nothing new here.
>> 
>> My particular approach to the description is irrelevant in many respects,
>> as CIF2 will be defined unambiguously.
>> However, it was an attempt to reconcile current CIF1 with CIF2 - e.g. using
>> the concept of separators rather than delimiters that effectively include
>> the separator in their definition, and describing everything in terms of
>> delimited and nondelimited strings.
>> 
>> Actually, I will not elaborate on my description further as the main point
>> in this message is given in (1) above. I've been trying to find examples
>> that break my assertion that a delimiter can be contained in a nondelimited
>> string as long as it's not the first character - perhaps someone can put me
>> out of my misery?
>> 
>> Cheers
>> 
>> Simon
>> 
>> 
>> 
>> 
>> ____________________________________________________________________________
>> From: SIMON WESTRIP <simonwestrip@btinternet.com>
>> To: Nick.Spadaccini@uwa.edu.au; Group finalising DDLm and associated
>> dictionaries <ddlm-group@iucr.org>
>> Sent: Monday, 30 November, 2009 9:32:07
>> Subject: Re: [ddlm-group] Space as a list item separator
>> 
>> Yes I agree - my wording "dropping the CIF1 syntax of requiring space after
>> data values" was simply careless here.
>> 
>> 
>> 
>> ____________________________________________________________________________
>> From: Nick Spadaccini <nick@csse.uwa.edu.au>
>> To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
>> Sent: Monday, 30 November, 2009 6:22:00
>> Subject: Re: [ddlm-group] Space as a list item separator
>> 
>> James has already elaborated on this, but for the record we have dropped the
>> rule that "a delimiter character and one whitespace" is MANDATED to be the
>> token delimiter.
>> We still require a space as a token separator, it is not elevated to being
>> part of the delimiter. If not there as a separator we have ways of
>> recovering with coercion rules. Clearly a whitespace is necessary to
>> separate non-delimited strings because they have no delimiting character.
>> 
>> This more consistent approach led to grammar rules that were the same
>> whether tokens were inside the new compound data types or not.
>> 
>> The previous discussions on this list elaborate on these points.
>> 
>> 
>> On 28/11/09 6:01 PM, "SIMON WESTRIP" <simonwestrip@btinternet.com> wrote:
>>
>>       I had been under the assumption that the separation of list
>>       items by a comma was 'set in stone'
>>       (and was one reason for dropping the CIF1 syntax of requiring
>>       space after data values),
>>       but if it's up for negotiation I would opt for using the space as
>>       a separator as elsewhere in the CIF,
>>       partly because then a list can essentially be treated much like
>>       a single-item loop - i.e. same basic parsing
>>       of <value><space><value><space>...
>>
>>       Cheers
>>
>>       Simon
>> 
>> ____________________________________________________________________________
>>       From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>>       To: Group finalising DDLm and associated dictionaries
>>       <ddlm-group@iucr.org>
>>       Cc: Nick.Spadaccini@uwa.edu.au
>>       Sent: Friday, 27 November, 2009 11:43:10
>>       Subject: Re: [ddlm-group] Space as a list item separator
>>
>>       Dear Colleagues,
>>
>>          I have no objection to accepting either comma or whitespace
>>       as a valid separator in a list.  I can't object -- I have been
>>       coding to that standard since 1997, and now would only have to
>>       remove the message generated for the case of the space.  We
>>       already accept multiple glyphs as valid separators at all levels:
>>
>>         whitespace itself is one of several character sequences in
>>       rather complex combinations:  any number of blanks, tabs, newlines
>>       and comments.
>>       The comma itself is handled in a complex way.  We accept (or
>>       should accept) any whitespace before and after a comma as valid,
>>       as in {a,b} versus {a , b }.  Adding the option of leaving out the
>>       comma itself and just having the whitespace as the separator makes
>>       just as much sense.
>>
>>         I see nothing to be gained by now forbidding the comma.  The
>>       meaning of {a,,b,} is the same as {a,.,b,.} or {a,?,b,?} or,
>>       under this new (and I think more sensible and realistic)
>>       approach, {a . b .} or {a ? b ?}.
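
The equivalence described above can be sketched in a few lines of Python, assuming a flat (non-nested) list body and that an empty position stands for the '.' placeholder. The helper name `fill_empty_items` is hypothetical, and this reflects only the reading proposed in this message, not an agreed CIF rule:

```python
def fill_empty_items(body: str, placeholder: str = ".") -> list[str]:
    """Split a flat list body on commas; empty positions become the placeholder."""
    return [item.strip() or placeholder for item in body.split(",")]


print(fill_empty_items("a,,b,"))    # ['a', '.', 'b', '.']
print(fill_empty_items("a,.,b,."))  # ['a', '.', 'b', '.']

# The blank-separated spelling of the same list needs no filling at all.
print("a . b .".split())            # ['a', '.', 'b', '.']
```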
>>
>>         The blank reads particularly well in dealing with vectors and
>>       matrices. The comma reads well when dealing with strings.
>>
>>         I think we would do best with both as valid alternatives (no
>>       error, no warning for either one).
>>
>>         Regards,
>>           Herbert
>>       =====================================================
>>        Herbert J. Bernstein, Professor of Computer Science
>>          Dowling College, Kramer Science Center, KSC 121
>>               Idle Hour Blvd, Oakdale, NY, 11769
>>
>>                        +1-631-244-3035
>>                        yaya@dowling.edu
>>       =====================================================
>>
>>       On Fri, 27 Nov 2009, SIMON WESTRIP wrote:
>>
>>       > At first glance, you're considering using space instead of commas
>>       > as list separators?
>>       > which is not so far away from the CIF1 requirement of space
>>       > following a delimiter?
>>       >
>>       > But I'm only on my first cup of coffee this morning :-)
>>       >
>>       >____________________________________________________________________________
>>       > From: Nick Spadaccini <nick@csse.uwa.edu.au>
>>       > To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
>>       > Sent: Friday, 27 November, 2009 7:46:44
>>       > Subject: Re: [ddlm-group] Space as a list item separator
>>       >
>>       >
>>       >
>>       >
>>       > On 27/11/09 2:32 PM, "James Hester" <jamesrhester@gmail.com> wrote:
>>       >
>>       > > See comments below:
>>       > >
>>       > > On Fri, Nov 27, 2009 at 3:09 PM, Nick Spadaccini
>>       > > <nick@csse.uwa.edu.au> wrote:
>>       > >> Timely email, came in just after the one I sent.
>>       > >>
>>       > >> My position is if we specify the syntax then we encourage its
>>       > >> correct use but acknowledge that there may be cases where one
>>       > >> might be able to recover intent. But I wouldn't encourage those
>>       > >> cases.
>>       > >
>>       > > Absolutely, which is why I would like to elevate space-separated
>>       > > list items to be correct syntax rather than 'wrong but intent is
>>       > > clear' syntax.
>>       > >>
>>       > >> You could say that token separators in lists are a or b or c,
>>       > >> but that just adds a level of complexity for very little gain.
>>       > >> The choice of comma makes it seamless to translate from the raw
>>       > >> CIF data straight into most language-specific data declarations.
>>       > >> The only language I know that accepts one or the other or both
>>       > >> is MatLab.
>>       > >
>>       > > Re ease of translation: you speak as if a viable approach to a
>>       > > CIF data file is to take whole text chunks and throw them at some
>>       > > language interpreter, without doing your own parse.  Quite apart
>>       > > from being a rather unlikely approach, this is impossible, as
>>       > > without parsing you won't know where the list finishes.  If you
>>       > > do do your own parse, you can populate your data structures
>>       > > directly during the parse, and what list separator was originally
>>       > > used in the data file is completely irrelevant.
>>       > >
>>       > > Re complexity: not sure how you are planning to deal with
>>       > > whitespace in the formal grammar, but consider the following,
>>       > > where I have assumed that each token 'eats up' the following
>>       > > whitespace.
>>       > >
>>       > > <dataitem> = <dataname><whitespace>+<datavalue>
>>       > > <datavalue> = {<list>|<string>}<whitespace>+
>>       > > <listdatavalue> = {<list>|<string>}<whitespace>*
>>       > > <list> = '[' <whitespace>* {<listdatavalue> {<comma><whitespace>*<listdatavalue>}*}* ']'
>>       > >
>>       > > If we make comma or whitespace possible separators, the last
>>       > > production becomes:
>>       > > <list> = '[' <whitespace>* {<listdatavalue> {<comma or whitespace><listdatavalue>}*}* ']'
>>       > >
>>       > > This looks like no extra complexity, and from a user's point of
>>       > > view whitespace as an alternative separator is simple to
>>       > > understand and consistent with space as a token separator used
>>       > > everywhere else in CIF.  Anyway, if reduction of grammar
>>       > > complexity is your goal, you can just completely exclude commas
>>       > > as list separators!
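
A sketch of how the second <list> production might be implemented as a small recursive-descent parser in Python. It assumes values are either nested lists or nondelimited strings (no quoted strings), keeps every value as a string, and treats a trailing comma or two commas in a row as syntax errors, as the productions assume. The function name `parse_list` is hypothetical:

```python
def parse_list(text: str) -> list:
    """Parse '[...]' where items are separated by a comma or by whitespace."""
    pos = 0

    def skip_ws() -> None:
        nonlocal pos
        while pos < len(text) and text[pos].isspace():
            pos += 1

    def parse_value():
        nonlocal pos
        if text[pos] == "[":
            return parse_brackets()
        start = pos
        # A nondelimited string ends at whitespace, a comma, or a bracket.
        while pos < len(text) and not text[pos].isspace() and text[pos] not in ",[]":
            pos += 1
        if start == pos:
            raise SyntaxError(f"expected a value at offset {pos}")
        return text[start:pos]

    def parse_brackets() -> list:
        nonlocal pos
        pos += 1  # consume '['
        items = []
        skip_ws()
        while pos < len(text) and text[pos] != "]":
            items.append(parse_value())
            saw_ws = pos < len(text) and text[pos].isspace()
            skip_ws()
            if pos < len(text) and text[pos] == ",":
                pos += 1  # comma separator, optionally surrounded by whitespace
                skip_ws()
                if pos < len(text) and text[pos] in ",]":
                    raise SyntaxError(f"empty list item at offset {pos}")
            elif pos < len(text) and text[pos] != "]" and not saw_ws:
                raise SyntaxError(f"missing separator at offset {pos}")
        if pos >= len(text):
            raise SyntaxError("unterminated list")
        pos += 1  # consume ']'
        return items

    skip_ws()
    if pos >= len(text) or text[pos] != "[":
        raise SyntaxError("a list must start with '['")
    result = parse_brackets()
    skip_ws()
    if pos != len(text):
        raise SyntaxError(f"unexpected text at offset {pos}")
    return result


# Both spellings give the same in-memory list.
print(parse_list("[1 2 3 4 5 6 7]") == parse_list("[1,2,3,4,5,6,7]"))  # True
print(parse_list("[[1,2,3] [4 5 6]]"))  # [['1', '2', '3'], ['4', '5', '6']]
```

Because the data structure is populated during the parse, the separator used in the file leaves no trace in the result.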
>>       >
>>       > Why not? Make them spaces only, and you become consistent across
>>       > the board. I have to think about the possibility of pathological
>>       > cases where spaces won't work. I can't think of any at the moment.
>>       >
>>       > >
>>       > > Some questions about how commas behave:
>>       > > 1: is a trailing comma e.g. [1,2,3,4,] a syntax error?
>>       > > 2. are two commas in a row a syntax error? E.g. [1,2,3,,4]
>>       >
>>       > I would say yes to syntax error. I can easily determine there may
>>       > need to be an additional list value, but can't determine what.
>>       >
>>       > > Note the above productions assume that the answer to both is yes.
>>       > >
>>       > >>
>>       > >> What big advantage to a language is there to specify you can use
>>       > >> a comma or whitespace as a token separator? Will you be happy
>>       > >> with the first person who interprets this as being ok
>>       > >>
>>       > >> loop_
>>       > >>   _severalvalues 1,2,3,4,5,6,7 # these being the 7 values of severalvalues
>>       > >>
>>       > > Not sure what you are getting at here: I am proposing the following:
>>       > >
>>       > > _nicelist      [1 2 3 4 5 6 7]
>>       > >
>>       > > being the same as
>>       > >
>>       > > _nicelist      [1,2,3,4,5,6,7]
>>       > >
>>       > >  Don't see how this relates to loops.
>>       >
>>       > The point was, once you say a space and comma are equivalent token
>>       > separators then will it be an interpretation that they are always
>>       > so even in loops? My example was not a list, just 7 values that
>>       > were separated by commas not spaces.
>>       >
>>       > >
>>       > > James.
>>       > > ------
>>       > >>
>>       > >> On 27/11/09 11:41 AM, "James Hester" <jamesrhester@gmail.com> wrote:
>>       > >>
>>       > >>> Dear All: looking over the list I posted previously of items left
>>       > >>> to resolve, I see only one serious one outstanding: whether or
>>       > >>> not to allow space as a separator between list items.  Nick has
>>       > >>> stated:
>>       > >>>
>>       > >>> " I will propose it has to be a comma, but make the coercion rule
>>       > >>> that space separated values in a list-type object be coerced into
>>       > >>> comma separated values. That is, read spaces as you want, but
>>       > >>> don't encourage them."
>>       > >>>
>>       > >>> I would like to counter-propose, as Joe did originally, that
>>       > >>> whitespace be elevated to equal status with comma as a valid list
>>       > >>> separator.  I see no downside to this.  Would anyone else like to
>>       > >>> speak to this issue before we vote?  In particular, I would be
>>       > >>> interested to hear why Nick doesn't want to encourage spaces.
>>       > >>
>>       > >> cheers
>>       > >>
>>       > >> Nick
>>       > >>
>>       > >> --------------------------------
>>       > >> Associate Professor N. Spadaccini, PhD
>>       > >> School of Computer Science & Software Engineering
>>       > >>
>>       > >> The University of Western Australia    t: +61 (0)8 6488 3452
>>       > >> 35 Stirling Highway                    f: +61 (0)8 6488 1089
>>       > >> CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
>>       > >> MBDP  M002
>>       > >>
>>       > >> CRICOS Provider Code: 00126G
>>       > >>
>>       > >> e: Nick.Spadaccini@uwa.edu.au
>>       > >>
>>       > >>
>>       > >>
>>       > >> _______________________________________________
>>       > >> ddlm-group mailing list
>>       > >> ddlm-group@iucr.org
>>       > >> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>       > >>
>>       > >
>>       > >
>>       >
>>       > cheers
>>       >
>>       > Nick
>> 
>> 
>> cheers
>> 
>> Nick
>> 
>
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
