Re: [ddlm-group] Use of elides in strings
- To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Use of elides in strings
- From: James Hester <jamesrhester@gmail.com>
- Date: Tue, 24 Nov 2009 13:00:59 +1100
- In-Reply-To: <4B0B2157.4090804@niehs.nih.gov>
- References: <C7306520.1258E%nick@csse.uwa.edu.au><572182.92308.qm@web87003.mail.ird.yahoo.com><279aad2a0911230240q278ab08fqc09349148202bed9@mail.gmail.com><4B0ABDF3.4090108@niehs.nih.gov><407817.81146.qm@web87008.mail.ird.yahoo.com><4B0B06A2.9050404@niehs.nih.gov><279aad2a0911231529h3bd3e0b6k98de25088410f536@mail.gmail.com><4B0B2157.4090804@niehs.nih.gov>
OK, my rewritten voting proposal appears to be an abject failure. Let me
repeat 1 as clearly as possible:

1. Should CIF2 allow elision of terminator characters? In other words,
should we make it possible to include <quote> as a normal character in a
<quote> delimited string?

Herbert: It's difficult to understand how to rephrase things if it is not
clear where exactly the problem lies.

Joe: good point about double backslash. Consider this added to proposal (a).

Before we discuss (2) precisely, can we agree to use the following abstract
model and terminology for CIF2 file parsing and dictionary application? If
not, please indicate your alternative.

1. A CIF lexer separates a CIF file into tokens according to the CIF2 syntax
specification only, that is, this process cannot be altered by DDL
directives.

2. A CIF parser accepts the tokens from the lexer. CIF parsers can be
modelled as performing at least the following actions with these tokens:
  (i) assignment of datavalue to dataname
  (ii) grouping looped datanames into a set
  (iii) assigning looped datavalues to the appropriate dataname and packet
  (iv) editing datavalues according to the syntax specification if this has
  not been performed in the lexer (e.g. stripping enclosing quotes,
  removing elides)

3. DDL dictionaries operate on and refer to the datavalues and datanames
returned by the CIF parser after (2). They have no ability to influence the
lexing process, or the parsing actions listed above (in particular the
datavalue editing).

4. The 'string value' or 'value' of a token is that value returned by the
parser in (2).
In particular, this is the value that:
  (i) may be checked against regular expressions in the dictionary;
  (ii) is accessed by dREL expressions;
  (iii) is returned by dREL expressions;
  (iv) is referred to in dictionary descriptive text;
  (v) may be passed to client routines for further editing;
  (vi) may be passed to external applications

[Side note: in other words the parser returns the CIF "infoset" and the
dictionaries refer to the CIF "infoset", but we haven't been talking in
those terms so I've been more explicit].

So my voting question (2) is: should the 'string value' of a token referred
to in (4) include the eliding characters?

On Tue, Nov 24, 2009 at 10:57 AM, Joe Krahn <krahn@niehs.nih.gov> wrote:
> A few points to consider:
>
> James Hester wrote:
> ...
>> 2. Character(s) used to indicate elision should be part of the string value
> This does not specify where the elision character should be stripped. It
> could be done by the parser or the dictionary-level code. The rule only
> refers to the final string for the final output text, right?
>
>>
>> Now for the specifics:
>>
>> 3. Which of the following elision proposals do you support (more than one OK)?
>>
>> Proposal (a) (intended to correspond to Nick's)
>> (i) A character which would otherwise be interpreted as a delimiter
>> is elided by immediately preceding it with a reverse solidus.
>> (ii) Otherwise a reverse solidus in the string has no special
>> lexical significance.
>>
>> Proposal (b)
>> (i) The combinations <reverse solidus><quote> or a <reverse
>> solidus><double quote> always signify <quote> and <double quote>
>> respectively, regardless of the delimiter used in a particular string.
>> (ii) The combinations in (i) elide the <quote> or <double quote>
>> character where that character would otherwise terminate the string
>> (iii) Apart from (i) and (ii), the reverse solidus has no special
>> significance
>> (iv) If not used as the string delimiter, <quote> or <double quote>
>> when not preceded by <reverse solidus> represent themselves.
>
> In both forms <reverse solidus><reverse solidus> should also be defined
> in order to allow a literal string that ends in <reverse solidus>. For
> example, a single <reverse solidus> character has to be written as "\\",
> to avoid eliding the close quote.
>
> Joe Krahn
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
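
For anyone who finds the difference between proposals (a) and (b) easier to see in code, here is a rough Python sketch of how a scanner might treat a delimited string under each one, with Joe's <reverse solidus><reverse solidus> rule folded into both. It is only an illustration under stated assumptions, not an agreed CIF2 algorithm: the function name `read_quoted`, the `proposal` argument and the `keep_elides` flag (which stands in for voting question 2) are all made up for this example, and other CIF rules about what may follow a closing delimiter are ignored.

```python
def read_quoted(text, proposal="a", keep_elides=False):
    """Scan a delimited string that starts at text[0].

    Returns (string_value, characters_consumed).
    Hypothetical helper for illustration only.
    """
    delim = text[0]
    if delim not in "'\"":
        raise ValueError("string must start with <quote> or <double quote>")

    # Characters that a reverse solidus elides under each proposal.
    # The reverse solidus itself is included, following Joe's suggestion.
    if proposal == "a":
        elidable = delim + "\\"      # only the delimiter actually in use
    elif proposal == "b":
        elidable = "'\"\\"           # both quote characters, always
    else:
        raise ValueError("proposal must be 'a' or 'b'")

    out = []
    i = 1
    while i < len(text):
        ch = text[i]
        if ch == "\\" and i + 1 < len(text) and text[i + 1] in elidable:
            # Voting question 2: does the eliding character stay in the value?
            if keep_elides:
                out.append(ch)
            out.append(text[i + 1])
            i += 2
        elif ch == delim:
            return "".join(out), i + 1   # unelided delimiter ends the string
        else:
            out.append(ch)               # everything else represents itself
            i += 1
    raise ValueError("unterminated string")
```

With this sketch, `'it\'s'` gives the value `it's` under either proposal (or `it\'s` if the eliding character is kept). The two proposals only differ when the elided quote is not the delimiter: `"a\'b"` keeps its reverse solidus under (a) but loses it under (b).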