Re: [ddlm-group] Summary of proposed CIF syntax changes
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Summary of proposed CIF syntax changes
- From: Nick Spadaccini <nick@csse.uwa.edu.au>
- Date: Wed, 09 Dec 2009 13:15:50 +0800
- In-Reply-To: <279aad2a0912051813p27ea25cdre57fe284efc81358@mail.gmail.com>
On 6/12/09 10:13 AM, "James Hester" <jamesrhester@gmail.com> wrote:

> On Sat, Dec 5, 2009 at 9:45 AM, Joe Krahn <krahn@niehs.nih.gov> wrote:
>> Semicolon and triple-quote strings do not emphasize that they cannot
>> contain embedded close-quotes, as done for single quotes.
>
> That is, they cannot contain embedded triple quotes/embedded
> newline-semicolons.

Correct. The wording in the document (pdf) Brian posted for me makes this
clear for all delimited strings, since the first subsequent terminating
character sequence delimits the token. Hence after the initialising """,
the next instance of """ terminates the string, so by definition it cannot
contain embedded """. The same holds for all the other string types.

>> In change 9, this sentence is hard to understand: "That does NOT require
>> that whitespace is necessary between the beginning of one token and the
>> beginning of the next token...". The main problem is that "token" is not
>> defined. In the example "[[1 2 3] [4 5 6]]" does each inner list count as
>> a token when parsing the outer list, and the initial '[' does not? Maybe
>> describe it as: whitespace is required between all values within a list
>> or table, but not between the values and the begin/end token.
>>
>> Was it decided that "[[1 2 3][4 5 6]]" is not allowed?
>
> Yes, we were looking for a concise expression that encompassed the following
> cases:
>
> 1. [[1 2 3][4 5 6]] is allowed and is equivalent to [ [ 1 2 3 ] [ 4 5 6 ] ]
> 2. [abc[1 2 3]qef] is allowed and is equivalent to [ abc [ 1 2 3 ] qef ]
> 3. [ "abc""qef" ] is not allowed
>
> Perhaps someone can suggest a better formulation?

The current construction defines data tokens, and states that they need a
separator between the end of one data token and the beginning of the next.
If one is to build a parser that strictly adheres to the specification then

  [[1 2 3] [4 5 6]] = [ [ 1 2 3 ] [ 4 5 6 ] ] is allowed and
  [[1 2 3][4 5 6]] is strictly illegal.

  [abc [1 2 3] qef] = [ abc [ 1 2 3 ] qef ] is allowed and
  [abc[1 2 3]qef] is strictly illegal.

  ["abc" "qef"] is allowed and
  [ "abc""qef" ] is strictly illegal.

This would be for the "pedantic" implementation of the specification.

However, given that we now accept that the terminating sequence is one (or
more) characters, irrespective of the space, an implementation of the parser
can be more liberal. In [[1 2 3][4 5 6]] the first ] terminates, by
definition, the inner list. It SHOULD have a separator but doesn't. The next
[ initiates, by definition, a new list, hence the parser can choose to
interpret this as [[1 2 3] [4 5 6]]. An application should ALWAYS write a CIF
according to the specification, that is, output [[1 2 3] [4 5 6]] and never
[[1 2 3][4 5 6]]. BTW James is next to me and happy with this interpretation.

The same sort of interpretation can be made for the other two examples.

>> It is not clear whether white space is allowed adjacent to the
>> associative colon.
>
> The plan was to disallow it for simplicity, although the parsing would be
> unambiguous even if whitespace were present

Agreed.

>> Why does the associative index require quotes? Are there any
>> restrictions on the string index such as maximum length, or whether it
>> can contain multiple lines? Is matching case sensitive?
>
> No restrictions on length, multiple lines possible, case sensitive matching.

Agreed, though I can't fathom why someone would want multiple lines in a
hash index.

> Requirement of quotes for simplicity - should we drop this?

And ease of parsing.

>> Also, the "smart quotes" in the PDF should be fixed to be normal ASCII.
>>
>>
>> Joe
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>

cheers

Nick

--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
MBDP M002

CRICOS Provider Code: 00126G           e: Nick.Spadaccini@uwa.edu.au
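
The difference between the "pedantic" and the more liberal reading of the separator rule can be illustrated with a small Python sketch. It is not taken from the CIF specification or from any existing parser: the names `parse`, `format_list` and `CifSyntaxError` are hypothetical, and the grammar is deliberately reduced to lists, "..."-quoted strings and bare words (assumed here to end at whitespace, a bracket or a quote). In strict mode a missing separator between the end of one value and the start of the next is an error; in liberal mode the terminating character itself is taken to end the token.

```python
WHITESPACE = " \t\r\n"


class CifSyntaxError(ValueError):
    """Raised for input the (simplified) grammar does not accept."""


def _parse_value(text, pos, strict):
    """Parse one value starting at text[pos]; return (value, next position)."""
    ch = text[pos]
    if ch == "[":
        return _parse_list_body(text, pos + 1, strict)
    if ch == '"':
        end = text.find('"', pos + 1)  # the first closing quote terminates
        if end < 0:
            raise CifSyntaxError("unterminated quoted string")
        return text[pos + 1:end], end + 1
    # Bare word: assumed to run until whitespace, a bracket or a quote.
    end = pos
    while end < len(text) and text[end] not in WHITESPACE + '[]"':
        end += 1
    return text[pos:end], end


def _parse_list_body(text, pos, strict):
    """Parse list items after an opening '['; return (items, next position)."""
    items = []
    value_just_ended = False
    while True:
        start = pos
        while pos < len(text) and text[pos] in WHITESPACE:
            pos += 1
        if pos >= len(text):
            raise CifSyntaxError("unterminated list")
        if text[pos] == "]":  # no separator is needed before the close
            return items, pos + 1
        if strict and value_just_ended and pos == start:
            raise CifSyntaxError(f"missing separator at position {pos}")
        value, pos = _parse_value(text, pos, strict)
        items.append(value)
        value_just_ended = True


def parse(text, strict=True):
    """Parse a single list; strict=False tolerates missing separators."""
    text = text.strip()
    if not text.startswith("["):
        raise CifSyntaxError("expected a list")
    value, pos = _parse_list_body(text, 1, strict)
    if pos != len(text):
        raise CifSyntaxError("trailing characters after list")
    return value


def format_list(value):
    """Write a parsed list with a separator between every pair of values.

    Quoting is not preserved in this sketch; strings are emitted bare.
    """
    if isinstance(value, list):
        return "[" + " ".join(format_list(v) for v in value) + "]"
    return value
```

With these definitions, `parse("[[1 2 3][4 5 6]]")` raises `CifSyntaxError`, while `parse("[[1 2 3][4 5 6]]", strict=False)` yields the same nested list as the well-formed input, and `format_list` always writes the separated form regardless of how the input was spelled.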