Re: [ddlm-group] Python-type eliding for triple-quoted strings
- To: ddlm-group <ddlm-group@iucr.org>
- Subject: Re: [ddlm-group] Python-type eliding for triple-quoted strings
- From: James Hester <jamesrhester@gmail.com>
- Date: Tue, 4 Jan 2011 00:19:41 +1100
- In-Reply-To: <AANLkTi=KRObuU61HryEUBCx=Od-RsL8GxsGWwZZ097ZK@mail.gmail.com>
- References: <AANLkTi=KRObuU61HryEUBCx=Od-RsL8GxsGWwZZ097ZK@mail.gmail.com>
My apologies, I sent this email instead of saving it for further
editing. You may disregard the contents until I resend it at a later
date.

James.

On Tue, Jan 4, 2011 at 12:17 AM, James Hester <jamesrhester@gmail.com> wrote:
> I am going to divide Ralf's proposal into two parts, which both
> separately solve the problem of representing every possible string in
> a CIF file.
>
> Proposal A: strings can be delimited by three quotes or three
> apostrophes ("cooked strings" hereafter) or else by three quotes or
> three apostrophes immediately preceded by the letter 'r' ("raw
> strings"). Both cooked and raw strings define two special sequences:
> <backslash><delimiter> and <backslash><backslash>. When these
> sequences are encountered in a cooked string, the first backslash is
> removed and the second character no longer has any special meaning
> (delimiter or elide). When these sequences are encountered in a raw
> string, they function as for a cooked string, but the initial
> <backslash> is not removed. Note that I have deliberately excluded the
> following escape sequences from this proposal as they are not
> syntactically relevant: \newline, \a, \b, \f, \n, \r, \t, \v, \ooo, \xhh
>
> Under Proposal A, the sequence <backslash><delimiter> is represented
> as <backslash><backslash><backslash><delimiter> in a cooked string.
> In a raw string, it may be left as <backslash><delimiter>. In a raw
> string, a string terminating with <delimiter> must contain
> <backslash><delimiter> as the last two characters. A raw string
> cannot finish with a single <backslash>.
>
> Proposal B: strings can be delimited by three quotes or three
> apostrophes or else by three quotes or three apostrophes immediately
> preceded by the letter 'u' ("unicode strings"). In a non-unicode
> string, no special behaviour is defined (as in the current CIF2
> proposal). In a Unicode string, the escapes \uxxxx and \Uxxxxxx are
> defined as the corresponding Unicode code point.
>
> I believe that this scheme is not particularly appropriate for the CIF
> context, which is unsurprising given that Python literals are designed
> for embedding in programs and CIF literals are intended to encapsulate
> arbitrary data. My criticisms are as follows:
>
> (1) Many of the <backslash><character> sequences in non-raw strings
> already have a meaning as IUCr markup or LaTeX markup
> (2) The lexer must be informed of the
> (3) Raw strings will include the <backslash><delimiter> sequence in
> the datavalue, meaning that the
>
> --
> T +61 (02) 9717 9907
> F +61 (02) 9717 3145
> M +61 (04) 0249 4148

--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group
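
The cooked and raw rules of Proposal A in the quoted draft can be sketched in Python roughly as follows. This is only an illustration under stated assumptions, not part of the proposal: it assumes that <delimiter> means the single quote or apostrophe character used in the triple delimiter, and the function names split_literal and decode are hypothetical.

```python
BACKSLASH = "\\"


def split_literal(text):
    """Split one triple-quoted literal found at the start of `text`.

    Returns (is_raw, quote_char, body, rest). Assumption: <delimiter>
    is the single quote/apostrophe character of the triple delimiter.
    """
    is_raw = text.startswith("r")
    start = 1 if is_raw else 0
    quote = text[start:start + 1]
    if quote not in ('"', "'") or text[start:start + 3] != quote * 3:
        raise ValueError("not a triple-quoted literal")
    body_start = start + 3
    i = body_start
    while i < len(text):
        if text[i] == BACKSLASH and text[i + 1:i + 2] in (BACKSLASH, quote):
            # Special sequence: the second character cannot act as a
            # delimiter or as an elide, so skip both characters.
            i += 2
        elif text.startswith(quote * 3, i):
            return is_raw, quote, text[body_start:i], text[i + 3:]
        else:
            i += 1
    raise ValueError("unterminated literal")


def decode(text):
    """Return the data value of a cooked or raw literal under Proposal A."""
    is_raw, quote, body, _rest = split_literal(text)
    if is_raw:
        # Raw strings: the initial backslash of each sequence is kept.
        return body
    out = []
    i = 0
    while i < len(body):
        if body[i] == BACKSLASH and body[i + 1:i + 2] in (BACKSLASH, quote):
            # Cooked strings: drop the first backslash, keep the second char.
            out.append(body[i + 1])
            i += 2
        else:
            out.append(body[i])
            i += 1
    return "".join(out)
```

With these assumptions, a cooked literal needs three backslashes before the delimiter character to carry <backslash><delimiter> in its value, whereas a raw literal carries the sequence unchanged and cannot end with a lone backslash, as the draft states.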