Re: [ddlm-group] Space as a list item separator
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Space as a list item separator
- From: "Herbert J. Bernstein" <yaya@bernstein-plus-sons.com>
- Date: Mon, 30 Nov 2009 08:36:24 -0500 (EST)
- In-Reply-To: <alpine.BSF.2.00.0911300822020.56763@epsilon.pair.com>
- References: <C7398588.126B6%nick@csse.uwa.edu.au><275884.79342.qm@web87006.mail.ird.yahoo.com><84104.25546.qm@web87002.mail.ird.yahoo.com><alpine.BSF.2.00.0911300822020.56763@epsilon.pair.com>
Sorry, something got lost in the prior message. It should have read:

> Dear Colleagues,
>
> Back to the question of commas. If you accept the desirability of
> having a CIF 1.5, commas in lists become very useful. Someone with
> a CIF 1.1 editor will be able to prepare a CIF 1.5 file for many
> useful cases by doing all lists with commas and no embedded blanks
> as long as they can make their lists fit on single lines. In CIF 1.1
>
> [[1,2,3],[4,5,6],[7,8,9]]
>
> is a valid value for a tag, but
>
> [[1 2 3] [4 5 6] [7 8 9]]
>
> is not.
>
> Having the option of commas in lists will help to smooth the
> transition for at least some people.
>
> Regards,
> Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
Dowling College, Kramer Science Center, KSC 121
Idle Hour Blvd, Oakdale, NY, 11769

+1-631-244-3035
yaya@dowling.edu
=====================================================

On Mon, 30 Nov 2009, Herbert J. Bernstein wrote:

> Dear Colleagues,
>
> Back to the question of commas. If you accept the desirability of
> having a CIF 1.5, commas in lists become very useful. Someone with
> a CIF 1.1 editor will be able to prepare a CIF 1.5 file for many
> useful cases by doing all lists with commas and no embedded blanks
> as long as they can make their lists fit on single lines. In CIF 1.1
>
> [[1,2,3],[4,5,6],[7,8,9]]
>
> is a valid value for a tag, but
>
> [[1 2 3] [4 5 6] [7 8 9]]
>
> Having the option of commas in lists will help to smooth the
> transition for at least some people.
>
> Regards,
> Herbert
>
> =====================================================
> Herbert J. Bernstein, Professor of Computer Science
> Dowling College, Kramer Science Center, KSC 121
> Idle Hour Blvd, Oakdale, NY, 11769
>
> +1-631-244-3035
> yaya@dowling.edu
> =====================================================
>
> On Mon, 30 Nov 2009, SIMON WESTRIP wrote:
>
>> OK - one last bash at explaining my recent description of base CIF syntax
>> and the motivation behind it.
>>
>> I'll start with the motivation:
>>
>> (1) reduce the restriction on the character set of non-delimited strings.
>>
>> In CIF2 the character set of nondelimited strings has been restricted to
>> disallow e.g. ' in O1' because this can lead to ambiguity in e.g. lists.
>> However, lists require a separator (say it's a comma for the sake of
>> argument), so O1' can be included in a list, e.g. [O1',O1',O1'].
>> The important point here is that we require separators
>> between tokens, so these separators have significance in parsing and
>> effectively terminate a nondelimited string.
>> Obviously, the separators cannot be part of a nondelimited string, which
>> is why I specified a single separator in my recent description.
>>
>> (2) reduce the base syntax as far as possible to something that is readily
>> parsable by both machine and human, and can be seen as set-in-stone so
>> that we don't have the same problems when going from CIF2 to CIF3 that we
>> have in going from CIF1 to CIF2.
>>
>> I think we are all agreed that one of the aims in defining CIF2 is to
>> define something that will be the base for all future CIF versions, so
>> there's nothing new here.
>>
>> My particular approach to the description is irrelevant in many respects,
>> as CIF2 will be defined unambiguously.
>> However, it was an attempt to reconcile current CIF1 with CIF2 - e.g. using
>> the concept of separators rather than delimiters that effectively include
>> the separator in their definition, and
>> describing everything in terms of delimited and nondelimited strings.
>>
>> Actually, I will not elaborate on my description further as the main point
>> in this message is given in (1) above. I've been trying to find examples
>> that break my assertion that a delimiter can be contained in a
>> nondelimited string as long as it's not the first character - perhaps
>> someone can put me out of my misery?
>>
>> Cheers
>>
>> Simon
>>
>> ____________________________________________________________________________
>> From: SIMON WESTRIP <simonwestrip@btinternet.com>
>> To: Nick.Spadaccini@uwa.edu.au; Group finalising DDLm and associated
>> dictionaries <ddlm-group@iucr.org>
>> Sent: Monday, 30 November, 2009 9:32:07
>> Subject: Re: [ddlm-group] Space as a list item separator
>>
>> Yes I agree - my wording "dropping the CIF1 syntax of requiring space after
>> data values" was simply careless here.
>>
>> ____________________________________________________________________________
>> From: Nick Spadaccini <nick@csse.uwa.edu.au>
>> To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
>> Sent: Monday, 30 November, 2009 6:22:00
>> Subject: Re: [ddlm-group] Space as a list item separator
>>
>> James has already elaborated on this but for the record we have dropped
>> that "a delimiter character and one whitespace" is MANDATED to be the token
>> delimiter.
>> We still require a space as a token separator, it is not elevated to being
>> part of the delimiter. If not there as a separator we have ways of
>> recovering with coercion rules. Clearly a whitespace is necessary to
>> separate non-delimited strings because they have no delimiting character.
>>
>> This more consistent approach led to grammar rules that were the same
>> whether tokens were inside the new compound data types or not.
>>
>> The previous discussions on this list elaborate on these points.
>>
>>
>> On 28/11/09 6:01 PM, "SIMON WESTRIP" <simonwestrip@btinternet.com> wrote:
>>
>> I had been under the assumption that the separation of list
>> items by a comma was 'set in stone'
>> (and was one reason for dropping the CIF1 syntax of requiring
>> space after data values),
>> but if it's up for negotiation I would opt for using the space as
>> a separator as elsewhere in the CIF,
>> partly because then a list can essentially be treated much like
>> a single-item loop - i.e. same basic parsing
>> of <value><space><value><space>...
>>
>> Cheers
>>
>> Simon
>>
>> ____________________________________________________________________________
>> From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
>> To: Group finalising DDLm and associated dictionaries
>> <ddlm-group@iucr.org>
>> Cc: Nick.Spadaccini@uwa.edu.au
>> Sent: Friday, 27 November, 2009 11:43:10
>> Subject: Re: [ddlm-group] Space as a list item separator
>>
>> Dear Colleagues,
>>
>> I have no objection to accepting either comma or whitespace
>> as a valid separator in a list. I can't object -- I have been
>> coding to that standard since 1997, and now would only have to
>> remove the message generated for the case of the space. We
>> already accept multiple glyphs as valid separators at all levels:
>>
>> whitespace itself is one of several character sequences in rather
>> complex combinations: any number of blanks, tabs, newlines and
>> comments.
>> The comma itself is handled in a complex way. We accept (or
>> should accept) any whitespace before and after a comma as valid,
>> as in {a,b} versus {a , b }. Adding the option of leaving out the
>> comma itself and just having the whitespace as the separator makes
>> just as much sense.
>>
>> I see nothing to be gained by now forbidding the comma. The
>> meaning of {a,,b,} is the same as {a,.,b,.} or {a,?,b,?} or,
>> under this new (and I think more sensible and realistic
>> approach) {a . b .} or {a ? b ?}.
>>
>> The blank reads particularly well in dealing with vectors and
>> matrices. The comma reads well when dealing with strings.
>>
>> I think we would do best with both as valid alternatives (no
>> error, no warning for either one).
>>
>> Regards,
>> Herbert
>> =====================================================
>> Herbert J. Bernstein, Professor of Computer Science
>> Dowling College, Kramer Science Center, KSC 121
>> Idle Hour Blvd, Oakdale, NY, 11769
>>
>> +1-631-244-3035
>> yaya@dowling.edu
>> =====================================================
>>
>> On Fri, 27 Nov 2009, SIMON WESTRIP wrote:
>>
>> > At first glance, you're considering using space instead of commas as
>> > list separators?
>> > which is not so far away from the CIF1 requirement of space following a
>> > delimiter?
>> >
>> > But I'm only on my first cup of coffee this morning :-)
>> >
>> > ____________________________________________________________________________
>> > From: Nick Spadaccini <nick@csse.uwa.edu.au>
>> > To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
>> > Sent: Friday, 27 November, 2009 7:46:44
>> > Subject: Re: [ddlm-group] Space as a list item separator
>> >
>> >
>> > On 27/11/09 2:32 PM, "James Hester" <jamesrhester@gmail.com> wrote:
>> >
>> > > See comments below:
>> > >
>> > > On Fri, Nov 27, 2009 at 3:09 PM, Nick Spadaccini <nick@csse.uwa.edu.au>
>> > > wrote:
>> > >> Timely email, come in just after the one I sent.
>> > >>
>> > >> My position is if we specify the syntax then we encourage its correct
>> > >> use but acknowledge that there may be cases where one might be able to
>> > >> recover intent. But I wouldn't encourage those cases.
>> > >
>> > > Absolutely, which is why I would like to elevate space-separated list
>> > > items to be correct syntax rather than 'wrong but intent is clear'
>> > > syntax.
>> > >>
>> > >> You could say that token separator in lists are a or b or c, but that
>> > >> just adds a level of complexity for very little gain. The choice of
>> > >> comma makes it seamless to translate from the raw CIF data straight in
>> > >> to most language specific data declaration. The only language I know
>> > >> that accepts one or the other or both is MatLab.
>> > >
>> > > Re ease of translation: you speak as if a viable approach to a CIF data
>> > > file is to take whole text chunks and throw them at some language
>> > > interpreter, without doing your own parse. Quite apart from being a
>> > > rather unlikely approach, this is impossible, as without parsing you
>> > > won't know where the list finishes. If you do do your own parse, you
>> > > can populate your datastructures directly during the parse, and what
>> > > list separator was originally used in the data file is completely
>> > > irrelevant.
>> > >
>> > > Re complexity: not sure how you are planning to deal with whitespace in
>> > > the formal grammar, but consider the following, where I have assumed
>> > > that each token 'eats up' the following whitespace.
>> > >
>> > > <dataitem> = <dataname><whitespace>+<datavalue>
>> > > <datavalue> = {<list>|<string>}<whitespace>+
>> > > <listdatavalue> = {<list>|<string>}<whitespace>*
>> > > <list> = '[' <whitespace>* {<listdatavalue>
>> > >          {<comma><whitespace>*<listdatavalue>}*}* ']'
>> > >
>> > > If we make comma or whitespace possible separators, the last production
>> > > becomes:
>> > > <list> = '[' <whitespace>* {<listdatavalue>
>> > >          {<comma or whitespace><listdatavalue>}*}* ']'
>> > >
>> > > This looks like no extra complexity, and from a user's point of view
>> > > whitespace as an alternative separator is simple to understand and
>> > > consistent with space as a token separator used everywhere else in CIF.
>> > > Anyway, if reduction of grammar complexity is your goal, you can just
>> > > completely exclude commas as list separators!
>> >
>> > Why not? Make them spaces only, and you become consistent across the
>> > board. I have to think about the possibility of pathological cases where
>> > spaces won't work. I can't think of any at the moment.
>> >
>> > >
>> > > Some questions about how commas behave:
>> > > 1. is a trailing comma e.g. [1,2,3,4,] a syntax error?
>> > > 2. are two commas in a row a syntax error? E.g. [1,2,3,,4]
>> >
>> > I would say yes to syntax error. I can easily determine there may need to
>> > be an additional list value, but can't determine what.
>> >
>> > > Note the above productions assume that the answer to both is yes.
>> > >
>> > >>
>> > >> What big advantage to a language is there to specify you can use a
>> > >> comma or whitespace as a token separator? Will you be happy with the
>> > >> first person who interprets this as being ok
>> > >>
>> > >> loop_
>> > >> _severalvalues 1,2,3,4,5,6,7 # these being the 7 values of severalvalues
>> > >>
>> > > Not sure what you are getting at here: I am proposing the following:
>> > >
>> > > _nicelist [1 2 3 4 5 6 7]
>> > >
>> > > being the same as
>> > >
>> > > _nicelist [1,2,3,4,5,6,7]
>> > >
>> > > Don't see how this relates to loops.
>> >
>> > The point was, once you say a space and comma are equivalent token
>> > separators then will it be an interpretation that they are always so even
>> > in loops? My example was not a list, just 7 values that were separated by
>> > commas not spaces.
>> >
>> > >
>> > > James.
>> > > ------
>> > >>
>> > >> On 27/11/09 11:41 AM, "James Hester" <jamesrhester@gmail.com> wrote:
>> > >>
>> > >>> Dear All: looking over the list I posted previously of items left to
>> > >>> resolve, I see only one serious one outstanding: whether or not to
>> > >>> allow space as a separator between list items. Nick has stated:
>> > >>>
>> > >>> " I will propose it has to be a comma, but make the coercion rule that
>> > >>> space separated values in a list-type object be coerced into comma
>> > >>> separated values. That is, read spaces as you want, but don't
>> > >>> encourage them."
>> > >>>
>> > >>> I would like to counter-propose, as Joe did originally, that
>> > >>> whitespace be elevated to equal status with comma as a valid list
>> > >>> separator. I see no downside to this. Would anyone else like to speak
>> > >>> to this issue before we vote? In particular, I would be interested to
>> > >>> hear why Nick doesn't want to encourage spaces.
>> > >>
>> > >> cheers
>> > >>
>> > >> Nick
>> > >>
>> > >> --------------------------------
>> > >> Associate Professor N. Spadaccini, PhD
>> > >> School of Computer Science & Software Engineering
>> > >>
>> > >> The University of Western Australia    t: +61 (0)8 6488 3452
>> > >> 35 Stirling Highway                    f: +61 (0)8 6488 1089
>> > >> CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
>> > >> MBDP M002
>> > >>
>> > >> CRICOS Provider Code: 00126G
>> > >>
>> > >> e: Nick.Spadaccini@uwa.edu.au
>> > >>
>> > >> _______________________________________________
>> > >> ddlm-group mailing list
>> > >> ddlm-group@iucr.org
>> > >> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>> > >>
>> > >
>> >
>> > cheers
>> >
>> > Nick
>> >
>> > --------------------------------
>> > Associate Professor N. Spadaccini, PhD
>> > School of Computer Science & Software Engineering
>> >
>> > The University of Western Australia    t: +61 (0)8 6488 3452
>> > 35 Stirling Highway                    f: +61 (0)8 6488 1089
>> > CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
>> > MBDP M002
>> >
>> > CRICOS Provider Code: 00126G
>> >
>> > e: Nick.Spadaccini@uwa.edu.au
>> >
>> > _______________________________________________
>> > ddlm-group mailing list
>> > ddlm-group@iucr.org
>> > http://scripts.iucr.org/mailman/listinfo/ddlm-group
>> >
>>
>> ________________________________________________________________________
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>
>> cheers
>>
>> Nick
>>
>> --------------------------------
>> Associate Professor N. Spadaccini, PhD
>> School of Computer Science & Software Engineering
>>
>> The University of Western Australia    t: +61 (0)8 6488 3452
>> 35 Stirling Highway                    f: +61 (0)8 6488 1089
>> CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
>> MBDP M002
>>
>> CRICOS Provider Code: 00126G
>>
>> e: Nick.Spadaccini@uwa.edu.au
>>
>
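
The list syntax debated in this thread can be sketched in Python as follows. This is only an illustration under stated assumptions, not the CIF2 grammar: the names parse_list, _parse_value and _parse_items are hypothetical, every value is kept as a plain string, and quoted strings and comments are ignored. Commas and whitespace are both accepted as separators, while a trailing or doubled comma is rejected, following the answers given in the thread. Because a separator or bracket is what ends a nondelimited string, a value such as O1' needs no special handling.

```python
import re

# Assumed token set for this sketch: '[', ']', ',' or a run of characters
# containing no whitespace, comma, or bracket (a "nondelimited string").
_TOKEN = re.compile(r"[\[\],]|[^\s\[\],]+")


def parse_list(text):
    """Parse one bracketed list; items may be separated by commas or whitespace."""
    tokens = _TOKEN.findall(text)
    value, pos = _parse_value(tokens, 0)
    if pos != len(tokens):
        raise ValueError("unexpected text after the value")
    return value


def _parse_value(tokens, pos):
    """Return (value, next position) for a string or a nested list."""
    if pos >= len(tokens):
        raise ValueError("unexpected end of input")
    tok = tokens[pos]
    if tok == "[":
        return _parse_items(tokens, pos + 1)
    if tok in ("]", ","):
        raise ValueError(f"unexpected {tok!r}")
    return tok, pos + 1


def _parse_items(tokens, pos):
    """Collect list items up to the matching ']'."""
    items = []
    after_comma = False  # True when the previous token was a comma
    while True:
        if pos >= len(tokens):
            raise ValueError("missing ']'")
        tok = tokens[pos]
        if tok == "]":
            if after_comma:
                raise ValueError("trailing comma")  # e.g. [1,2,3,4,]
            return items, pos + 1
        if tok == ",":
            if after_comma or not items:
                raise ValueError("empty list item")  # e.g. [1,2,3,,4]
            after_comma = True
            pos += 1
            continue
        item, pos = _parse_value(tokens, pos)
        items.append(item)
        after_comma = False
```

With this sketch, parse_list("[1 2 3 4 5 6 7]") and parse_list("[1,2,3,4,5,6,7]") give the same result, as do the comma and blank forms of the 3x3 matrix, since the separator actually used is discarded during the parse. Accepting {a,,b,} as shorthand for {a,.,b,.}, as suggested earlier in the thread, would mean replacing the two ValueError cases for commas with an appended placeholder value.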
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group