Re: [ddlm-group] Role of separators in CIF

My vote remains for allowing ' and " in non-delimited strings except,
of course, as the first character -- Herbert

  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769


On Wed, 2 Dec 2009, Nick Spadaccini wrote:

> msg00038 correctly states that the restriction of the character set for
> non-delimited strings is non-negotiable if we are to adopt the features of
> DDLm. That is, a restriction on the syntax has to be in place for delimited
> lists and tables to be possible.
> At the time (some two messages prior) the restriction I proposed was
> everything except for alphanumeric ASCII and a few punctuation characters
> (because it was tightly aligned to the restrictions on characters for
> datanames). By the time that thread gets to msg00099 the restriction is only
> a few punctuation characters. Clearly people understood the
> non-negotiability had to do with the requirement of a restriction, but that
> the character set was negotiable - otherwise the 61 messages between
> msg00038 and msg00099 shouldn't have been possible.
> The restriction I proposed in msg00099 was all the terminator characters
> plus the token separators for lists and tables (those at that time, they
> have since evolved). The use of terminators in non-delimited strings can
> cause problems, especially when viewed or when they are ambiguous. Can I
> construct lexing rules such that " and ' can be included in a non-delimited
> string so that it is not ambiguous and does not result in error? Like Simon,
> I suspect yes; I haven't thought of an example that systematically fails.
> Do I think it is sensible to restrict the terminator characters and
> separators except for two? No. I think a consistent rule that terminators
> and separators are all disallowed makes more sense and is easier to articulate.
> For example the two cases below would be non-delimited strings.
> _quote   ."Hello"
> _quote ``Hello''
> However, it would seem only JW and I have this point of view. So cast your
> final votes and let's get on with it. I think this will finally finish the
> syntactic issues.
> For the record (I think) the restriction I propose would be
> " ' : { } [ ] # commas are now returned to the allowed list
> Or if you vote "for", the restriction is
> : { } [ ]
> Nick
> On 1/12/09 6:55 PM, "Brian McMahon" <bm@iucr.org> wrote:
>> I want to vote "For" on this proposition, but I'm concerned by Nick's
>> assertion of 9 October
>> http://www.iucr.org/__data/iucr/lists/ddlm-group/msg00038.html
>>      (1) restricting the character set of non-delimited strings is
>>      NON-NEGOTIABLE. If we don't restrict it, then we can't build
>>      recursive data structures and exploit DDLm.
>> I understood this to be definitely ruling out the embedding of the
>> quote characters in non-delimited strings, but I've lost track of the
>> details of the subsequent discussions.
>> Regards
>> Brian
>> On Tue, Dec 01, 2009 at 04:56:52PM +1100, James Hester wrote:
>>> Simon: From reading your previous emails, I'm guessing that the source of
>>> your concern is that the possible character set of non-delimited strings
>>> appears more restrictive than is strictly necessary.  In particular, you're
>>> not sure why we have excluded quote and double quote from non-delimited
>>> strings.
>>> You are correct that the other CIF2 syntax does not require that quote or
>>> double quote are excluded from non-delimited strings (apart from the first
>>> character, of course).  The exclusion of the quote/double quote was on the
>>> general principle of keeping all characters that serve as delimiters out of
>>> non-delimited strings, even if those characters could never cause confusion.
>>> It also has the benefit of allowing some syntax errors to be picked up.
>>> Nick is with me in Sydney, and we have decided that this is the sort of
>>> issue that we just have to vote on, as the arguments either way are not
>>> conclusive.
>>> I would therefore call everybody to vote on the following proposition:
>>> "That <quote> and <double quote> may appear in non-delimited strings, as
>>> long as they are not the first character"
>>> Voting so far:
>>> Against: Nick
>>> For: James
>>> Agnostic: ?
>>> --
>>> T +61 (02) 9717 9907
>>> F +61 (02) 9717 3145
>>> M +61 (04) 0249 4148
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
> cheers
> Nick
> --------------------------------
> Associate Professor N. Spadaccini, PhD
> School of Computer Science & Software Engineering
> The University of Western Australia    t: +61 (0)8 6488 3452
> 35 Stirling Highway                    f: +61 (0)8 6488 1089
> CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
> MBDP  M002
> CRICOS Provider Code: 00126G
> e: Nick.Spadaccini@uwa.edu.au
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
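
The two candidate rules in Nick's quoted message can be compared with a
minimal sketch. This is not CIF parser code: the function name, the flag,
and the treatment of whitespace are assumptions made for illustration, and
the "#" in Nick's list is read here as introducing a comment rather than as
a restricted character. Other CIF restrictions on non-delimited strings
(reserved words, leading "_" or "$", and so on) are ignored.

```python
# Hypothetical checker for the two rules put to the vote.
# Names and character sets are illustrative assumptions, not a CIF library API.

QUOTES = set("'\"")          # the two characters the vote is about
SEPARATORS = set(":{}[]")    # restricted under both outcomes of the vote


def is_non_delimited(value: str, allow_embedded_quotes: bool) -> bool:
    """Return True if `value` could stand as a non-delimited string.

    allow_embedded_quotes=True  models a vote "for" (only : { } [ ] barred).
    allow_embedded_quotes=False models a vote "against" (" and ' barred too).
    """
    # Assumption: an empty token or one containing whitespace is never
    # a single non-delimited string.
    if not value or any(ch.isspace() for ch in value):
        return False
    # Under either rule a quote may not be the first character.
    if value[0] in QUOTES:
        return False
    forbidden = SEPARATORS if allow_embedded_quotes else SEPARATORS | QUOTES
    return not any(ch in forbidden for ch in value)


if __name__ == "__main__":
    # Includes the two example values from Nick's message.
    for text in ['."Hello"', "``Hello''", "a:b", "'Hello"]:
        print(repr(text),
              "for:", is_non_delimited(text, True),
              "against:", is_non_delimited(text, False))
```

With these assumptions, Nick's two example values are accepted under the
"for" rule and rejected under the stricter rule, while a value beginning
with a quote is rejected under both.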
