Re: [ddlm-group] THREAD TRIPLE QUOTES - Specification

Dear Colleagues,

    Inasmuch as we have already generated many CIFs under the existing
handling of the reverse solidus, which was to treat it as subordinate
to the existing quoting schemes, I would suggest that we retain the
same ordering of lexical scan for the treble quotes.  This would
allow us to keep the handling of embedded treble quotes within treble
quotes without much special handling at all -- just break up any
embedded treble quotes with a reverse solidus.  To be more specific,
here is what I would suggest:

1.  On reads:
     1.1.  The reverse solidus is an ordinary character in the lexical
scan of a quoted string;
     1.2.  At the level of a CIF we retain the rule that no terminal
quote mark is recognized unless followed by whitespace;
     1.3.  At the level of a CIF we strip trailing whitespace from all
lines prior to the lexical scan;
     1.4.  On read, recognition of the reverse solidus is an
optional semantic interpretation (perhaps handled by a second-level
lexical scan or handled in the application) following the same
rules as Brian laid out for comments, semi-colon delimited strings
and, now, for treble quoted strings.
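
The read rules above can be sketched in Python as follows. This is only an illustration under stated assumptions, not an agreed implementation: the function names scan_treble_quoted and unfold are hypothetical, end of input is treated like whitespace after a closing delimiter, and the input is assumed to begin at the opening delimiter.

```python
def scan_treble_quoted(text):
    """Return the raw body of a treble-quoted string that opens `text`."""
    # Rule 1.3: strip trailing whitespace from every line before scanning.
    text = "\n".join(line.rstrip() for line in text.split("\n"))
    if not text.startswith('"""'):
        raise ValueError('text must open with """')
    pos = 3
    while True:
        end = text.find('"""', pos)
        if end == -1:
            raise ValueError("unterminated treble-quoted string")
        after = end + 3
        # Rule 1.2: a terminal delimiter must be followed by whitespace
        # (end of input is treated the same way here, an assumption).
        if after == len(text) or text[after].isspace():
            # Rule 1.1: the reverse solidus got no special treatment above.
            return text[3:end]
        pos = end + 1


def unfold(body):
    """Rule 1.4 (optional second pass): drop each reverse-solidus-newline pair."""
    return body.replace("\\\n", "")


raw = scan_treble_quoted('"""a """\\\nb""" rest')
print(repr(raw))          # body as scanned, reverse solidus still present
print(repr(unfold(raw)))  # body after the optional second pass
```

Keeping unfold separate from the scan mirrors the suggestion that the reverse solidus be interpreted only at a second level, or in the application.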

2.  On writes:
     In writing a treble quoted string, if a treble quote is
encountered as part of the quoted text, a reverse-solidus-newline
digraph would be inserted after the third quote mark, i.e.

"""This is an example
of a treble-quoted

might be written as

This is an example of a treble-quoted

The interesting case to then consider is whether we need to
do any quoting to protect the reverse soliduses (solidii?)
in the example when quoting it.
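
The write rule in item 2 might be sketched in Python like this. The function name write_treble_quoted is hypothetical, and the sketch deliberately does nothing about the open question just raised: a reverse-solidus-newline pair already present in the text is written out unprotected.

```python
def write_treble_quoted(body):
    """Wrap `body` in treble quotes, breaking up embedded treble quotes."""
    # Insert a reverse-solidus-newline digraph after the third quote mark
    # of every embedded treble quote, so that it is never followed
    # directly by whitespace and cannot be taken for the terminator.
    protected = body.replace('"""', '"""\\\n')
    return '"""' + protected + '"""'


print(write_treble_quoted('say """hi""" now'))
```

A reader that elides each reverse-solidus-newline pair in a second pass would recover the original text from this output.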

The main advantage of this approach is that ordinary quoted
cif-strings such as

_mugwump "muddy big water"

could then be treble quoted very directly as
"""_mugwump "muddy big water" """

rather than as

"""_mugwump " muddy big water" """

as would be required under Nick's suggestion.
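
For comparison, here is a rough Python sketch of the space-insertion scheme as I read Nick's rule (a) and parsing step (1). The names nick_encode and nick_decode are hypothetical, and rule (b) on eliding newlines is left out.

```python
def nick_encode(body):
    """On write: insert one space immediately after every double quote."""
    return body.replace('"', '" ')


def nick_decode(encoded):
    """On read: delete one of the spaces after every double quote."""
    return encoded.replace('" ', '"')


original = '_mugwump "muddy big water"'
encoded = nick_encode(original)
print('"""' + encoded + '"""')
print(nick_decode(encoded) == original)
```

Every quoted value in an example is altered under this scheme, whereas under the reverse-solidus scheme the text changes only where a treble quote is embedded.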


  Herbert J. Bernstein, Professor of Computer Science
    Dowling College, Kramer Science Center, KSC 121
         Idle Hour Blvd, Oakdale, NY, 11769


On Fri, 11 Sep 2009, Herbert J. Bernstein wrote:

> The main value of the treble quoted string is that it allows a much
> neater presentation of examples of chunks of CIFs and text in which
> presenting such information within semi-colon quoted strings gets
> somewhat confusing.
> For this reason, I would suggest that the most important test of
> Nick's suggestions would be how faithfully a semi-colon delimited
> example could be included with _no_ added or subtracted characters,
> so that people reading dictionaries by eye will reproduce those
> examples correctly.
> For that reason, I hope we can stay as close to """ and ''' delimiting
> truly raw data as possible.
> Regards,
>  Herbert
> =====================================================
> Herbert J. Bernstein, Professor of Computer Science
>   Dowling College, Kramer Science Center, KSC 121
>        Idle Hour Blvd, Oakdale, NY, 11769
>                 +1-631-244-3035
>                 yaya@dowling.edu
> =====================================================
> On Fri, 11 Sep 2009, Nick Spadaccini wrote:
>> Our last discussion on the implementation of triple quoted strings resulted
>> in much to-ing and fro-ing and in the end the conclusion was that its
>> behaviour was to be identical to the semi-colon delimited strings. I
>> preferred a greater degree of parsing of the string but this was not
>> popular.
>> Now our illustrious chair, who sits next to me now, asks the question "what
>> is the point of the triple quoted string", to which I can only shrug my
>> shoulders. The triple quote string will be useful for containing strings
>> that include ", ' and ; (in the first character in a record). They will of
>> course fail when you attempt to include the sequence """. Hence they are no
>> different to a ; delimited string that cannot include a ; as the first
>> character of a line.
>> Here is a suggestion. The triple quote string (delimited by """) will treat
>> its contents as raw, except that
>> (a) When writing the string, ALL quotes contained within will have a space
>> inserted immediately after the " character. This will allow the triple quote
>> to be contained within the string by breaking the sequence with spaces so
>> the tokeniser is not fooled into terminating the string. Clearly the
>> reverse operation is required in reading the string. In this way it is
>> possible to include all manner of text, markup and programming scripts
>> within a triple quoted string.
>> (b) We will formally accept in this string the "eliding" of the newline
>> character. Hence a reverse solidus (\) immediately prior to the record
>> terminating character(s) will imply the \ and the record terminating
>> characters are deleted from the stream, and the next line is wrapped around.
>> To allow for the odd case when one wants to literally include the \ and the
>> record terminating character(s) in the string then the required \ will be
>> elided.
>> In parsing the contents of a string the only things required are
>> (1) delete one of the spaces after every "
>> (2) treat \<newline> as a wrap around
>> (3) treat \\<newline> as the raw string \<newline>
>> All other characters are left as is.
>> cheers
>> Nick
>> --------------------------------
>> Associate Professor N. Spadaccini, PhD
>> School of Computer Science & Software Engineering
>> The University of Western Australia    t: +61 (0)8 6488 3452
>> 35 Stirling Highway                    f: +61 (0)8 6488 1089
>> CRAWLEY, Perth,  WA  6009 AUSTRALIA   w3: www.csse.uwa.edu.au/~nick
>> MBDP  M002
>> CRICOS Provider Code: 00126G
>> e: Nick.Spadaccini@uwa.edu.au
>> _______________________________________________
>> ddlm-group mailing list
>> ddlm-group@iucr.org
>> http://scripts.iucr.org/mailman/listinfo/ddlm-group