[ddlm-group] A final non-delimited string definition.
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: [ddlm-group] A final non-delimited string definition.
- From: Nick Spadaccini <nick@csse.uwa.edu.au>
- Date: Mon, 08 Feb 2010 13:28:28 +0800
- In-Reply-To: <C795BACB.12C49%nick@csse.uwa.edu.au>
Until now the definition of a non-delimited string was

-----------------------
A data value in CIF2 may be a non-delimited string of UTF-8 characters, but
excluding the ASCII characters, : { } [ ]. As with CIF1, the first character
of a non-delimited string cannot be any of the ASCII characters, " ' _ $,
since these have special meaning. A non-delimited string cannot exactly match
any STAR keyword, loop_ global_ save_* stop_ data_*, where * refers to zero or
more characters.
-----------------------

We accept white space delimited lists and tables, and if we restrict ourselves
to table indices being delimited strings then we can re-inject : { } [ and ]
as allowed characters within a non-delimited string. This will significantly
minimise handling issues with legacy CIFs and the need for remediation.

This means the "users" (and I never know who these people are, but they want
to be able to do everything) can have anything in a non-delimited string,
provided it contains no <whitespace>, does not begin with { [ " ' _ $, does
not end with } ], and does not exactly match a STAR keyword.

I actually don't think this is a good way to go, but there seems to be a
prevailing belief that users want all of this freedom, so I am happy for
parser developers and future dREL implementers to deal with it if the rest of
the group think it desirable.

On 8/02/10 12:57 PM, "Nick Spadaccini" <nick@csse.uwa.edu.au> wrote:

> This example is made more convoluted by including , in lists, which have
> no meaning. If I recall correctly we agreed on space delimited list values.
> If we stick to that then the confusions below disappear. It also removes the
> dangling comma and double comma problem.
>
> The parsing within a compound data type is exactly the same as outside a
> compound data type (hence you can build a simpler recursive descent parser).
>
> The correct definition of the list below is
>
> [1 #one
> 2 #two
> 3 #three
> 4 #four
> ]
>
> Which is the list [1 2 3 4] with embedded comments #one #two #three and
> #four.
>
> The list below is (I add the quotes for clarity)
> ["1,#one" "2," 3 ",4"] with the embedded comments #two #three and #four
>
>
> On 24/12/09 2:58 AM, "Joe Krahn" <krahn@niehs.nih.gov> wrote:
>
>> James Hester wrote:
>>> I would answer as follows:
>> ...
>>> 2) What are the rules for comments within lists and tables?
>>>
>>> I would treat them as whitespace
>> One detail is whether the "#" starting a comment requires preceding
>> whitespace. Herbert's example is:
>>
>> [1,#one
>> 2, #two
>> 3 #three
>> ,4 #four
>> ]
>>
>> He suggests that preceding whitespace is there only if needed to
>> terminate the preceding token, and not a requirement in the actual
>> comment syntax. This looks OK in the above example, but may not be clear
>> after a quoted string:
>> "string"#comment
>>
>> Or, perhaps this is no less clear than the lack of whitespace in a list:
>> ["string"]
>>
>>
>>> 4) Why require single or double quotes for table index strings, rather
>>> than just follow the normal quoting rules?
>>>
>>>
>>> No good reason - so let's just follow the normal rules.
>>>
>> Should quotes be required at all for the index string? Correct parsing
>> only requires quotes if the index string contains a colon. In the
>> current draft, that is imposed for all strings, not just table-index
>> strings. So, there is no need to mandate quotes here, unless the global
>> requirement to quote strings with : is dropped.
>>
>> Maybe the intention was to disallow multi-line index strings?
>>
>>
>>> Some of these are more technical details compared to the other issues.
>>> These came up while I was working on a big CIF2 regular-expression,
>>> where parsing details have to be considered more carefully.
>>>
>>> Actually, it would be rather good if you could post these regular
>>> expressions once we have a final specification, as they are likely to be
>>> useful to a broad audience.
>> I plan to do that. To be fully functional, it has to be done in Perl
>> syntax, which has a feature that allows recursion for table and list
>> values. Despite that caveat, it will be useful even where the full
>> recursive expression will not work.
>>
>> Joe
>
> cheers
>
> Nick

cheers

Nick

--------------------------------
Associate Professor N. Spadaccini, PhD
School of Computer Science & Software Engineering

The University of Western Australia    t: +61 (0)8 6488 3452
35 Stirling Highway                    f: +61 (0)8 6488 1089
CRAWLEY, Perth, WA 6009 AUSTRALIA      w3: www.csse.uwa.edu.au/~nick
MBDP M002

CRICOS Provider Code: 00126G

e: Nick.Spadaccini@uwa.edu.au

_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
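For readers who want to experiment with the relaxed rule proposed in this message, the following is a minimal Python sketch of a validity check, not a reference implementation. The function name `is_non_delimited` and the constant names are hypothetical. Two points are assumptions rather than anything stated in the message: keywords are matched case-insensitively (as is usual CIF1 practice), and "whitespace" is taken to be whatever Python's `str.isspace` reports. Other CIF1 restrictions not restated in the message (for example a leading `#` or `;`) are deliberately not checked.

```python
import re

# STAR keywords as listed in the message; "*" there means zero or more
# characters. Case-insensitive matching is an assumption, not from the message.
_STAR_KEYWORD = re.compile(
    r"(?:loop_|global_|stop_|save_.*|data_.*)", re.IGNORECASE | re.DOTALL
)

# Characters the proposal forbids at the start and at the end of the string.
_FORBIDDEN_FIRST = set("{[\"'_$")
_FORBIDDEN_LAST = set("}]")


def is_non_delimited(token: str) -> bool:
    """Return True if `token` satisfies the relaxed rule sketched above."""
    if not token:
        return False  # an empty token cannot be a data value
    if any(ch.isspace() for ch in token):
        return False  # whitespace would terminate the string
    if token[0] in _FORBIDDEN_FIRST:
        return False  # { [ " ' _ $ are not allowed at the beginning
    if token[-1] in _FORBIDDEN_LAST:
        return False  # } ] are not allowed at the end
    if _STAR_KEYWORD.fullmatch(token):
        return False  # exact match of a STAR keyword
    return True


if __name__ == "__main__":
    for candidate in ["12:30", "x[1]y", "x[1]", "{abc", "_name", "data_x", "loop_x"]:
        print(f"{candidate!r}: {is_non_delimited(candidate)}")
```

Under this reading, `12:30` and `x[1]y` are accepted (the characters : { } [ ] are allowed in the interior), while `x[1]` is rejected only because it ends in `]`, which is the case a list or table parser would otherwise have to disambiguate.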
- Follow-Ups:
- Re: [ddlm-group] A final non-delimited string definition. (James Hester)
- References:
- Re: [ddlm-group] CIF2 Syntax all wrapped up? (Nick Spadaccini)