Discussion List Archives


Re: [ddlm-group] Wrapping up the elide discussion

I believe that proposal F has the most support in this group and among
the voting COMCIFS members.  I reach this conclusion by assuming that
Ralf will prefer F, while Brian and I prefer F'.

I will shortly post a draft of the proposed change for technical
comment prior to requesting a COMCIFS vote.

James

On Mon, Jan 31, 2011 at 1:36 AM, John Westbrook <jwest@rcsb.rutgers.edu> wrote:
> I concur with Herbert and opt for the option F of those under consideration.
>
> I would appreciate an example of how to embed a triple quoted text section
> verbatim within a triple quoted section.   This is an issue for dictionary
> examples.  Does the proposal include both """ and ''' so that the string
> """'''my verbatim text'''""" is treated as '''my verbatim text'''?
>
> John
>
>
> On 1/30/11 9:00 AM, Herbert J. Bernstein wrote:
>> If the choice is only between F and F', I vote for F.
>>
>> To clarify:
>>
>> James' F' proposal was:
>>
>> "The datavalue is obtained from the triple-quoted string in two steps:
>> (1) All instances of <backslash><eol> are removed from the string
>> where the <backslash> is not preceded by another <backslash>
>> (2) All other instances of <backslash><eol> are replaced with <eol>
>>
>> "This means that a sequence of n backslashes followed by newline is
>> replaced by a sequence of n-1 backslashes followed by newline, except
>> if there is one backslash before the newline, in which case both
>> newline and backslash are removed.  Triple quote sequences are elided
>> by inserting a <backslash><eol> sequence between <delimiter>
>> characters to break up the triple delimiter sequence.  Note also that
>> backslash has no special meaning if not in a sequence finishing with
>> <eol>."
>>
>> Simon's F proposal was
>>
>> "If you're looking to base CIF extensions on established mechanisms,
>> why not adopt the minimal \(newline) and \\ escape sequences, which in
>> essence are the same as the established CIF line-folding protocol
>> (just dropping the initial \ following the opening delimiter and
>> formalising the protocol as an inherent part of the spec). After all,
>> I believe you have already been using it, or at least interpreted it,
>> as a means to escape 'semicolon delimiters' within semicolon-delimited
>> values (I seem to recall discussions that identified an issue with the
>> published 'trip tests' relating to line folding)."
>>
>> Under Simon's F proposal
>>
>> """\\\
>> """
>>
>> would mean one backslash (no trailing new line)
>>
>> and
>>
>> """\\
>> """
>>
>> would mean one backslash followed by a newline
>>
>> and
>>
>> """\\
>>
>> """
>>
>> would mean one backslash followed by two newlines
>>
>> while under James' F'
>>
>> """\\\
>> """
>>
>> would mean two backslashes (no trailing newline)
>>
>> and
>>
>> """\\
>> """
>>
>> would mean one backslash (no trailing newline)
>>
>> and
>>
>> """\\
>>
>> """
>>
>> would mean one backslash followed by a newline
>>
>>
>> While either proposal could, of course, be implemented, to me
>> Simon's proposal seems more complete and more consistent with
>> common programming practice in handling backslash elides.
>>
>> I agree with James that it is time to make a choice and move
>> on.  I just hope, if we cannot follow complete Python
>> practice, we at least take F, the proposal that is more
>> consistent with Python practice.
>>
>>
>>
>> At 7:57 AM -0500 1/30/11, Frances C. Bernstein wrote:
>>> Date: Sun, 30 Jan 2011 23:40:58 +1100
>>> From: James Hester <jamesrhester@gmail.com>
>>> Reply-To: Group finalising DDLm and associated dictionaries
>>>      <ddlm-group@iucr.org>
>>> To: ddlm-group <ddlm-group@iucr.org>
>>> Subject: [ddlm-group] Wrapping up the elide discussion
>>>
>>> Dear DDLm-ers,
>>>
>>> This latest round of discussion started as an attempt to find
>>> consensus on an elide system for CIF2 triple-quoted strings.  I have
>>> asked everybody to contribute their preferences, and now that John W
>>> and Ralf have replied to me off-list regarding their preferences for
>>> elides, we are in a position to read the tea-leaves and determine a
>>> consensus solution.  I can report that Ralf, while preferring the full
>>> Python approach (proposal P) will accept a solution that allows
>>> arbitrary strings to be included in a CIF file.  John W prefers a
>>> solution involving minimal changes to current syntax.
>>>
>>> So our top preferences are as follows:
>>>
>>> Herbert: P, otherwise F with conditions
>>> Brian: F' and E, P least preferable
>>> James: F' and F, P unacceptable
>>> Ralf: P best, A,B,E,F,F' OK
>>> John W: A, B or F' (my interpretation of minimal changes - John feel
>>> free to say otherwise)
>>>
>>> It appears that all but Herbert would be prepared to vote for F', and
>>> even Herbert is prepared to consider F.  No other proposal reaches a
>>> similar level of acceptance among voting members (and I note that
>>> non-voting members are also strongly in the F/F' camp).  I would
>>> therefore like to focus discussion on F' and F as the two choices most
>>> likely to succeed.
>>>
>>> The single point in favour of F' as opposed to F is that the sequence
>>> <backslash><backslash> has no meaning, which makes it simpler to
>>> include backslash-rich text (eg LaTeX or RTF).  This continues to be
>>> of particular concern among our journal colleagues.
>>>
>>> The single point that some consider to be in favour of F relative to
>>> F' is that it is a proper subset of Python syntax.
>>>
>>> If no consensus can be achieved following a small period for comment
>>> within this group, I propose voting between F or F', followed by a
>>> formal vote at COMCIFS level to accept the resulting elide system as
>>> an amendment to the current CIF2 standard.
>>>
>>> James.
>>>
>>> --
>>> T +61 (02) 9717 9907
>>> F +61 (02) 9717 3145
>>> M +61 (04) 0249 4148
>>> _______________________________________________
>>> ddlm-group mailing list
>>> ddlm-group@iucr.org
>>> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>
>
> --
> ******************************************************************
>   John Westbrook, Ph.D.
>   Rutgers, The State University of New Jersey
>   Department of Chemistry and Chemical Biology
>   610 Taylor Road
>   Piscataway, NJ 08854-8087
>   e-mail: jwest@rcsb.rutgers.edu
>   Ph:  (732) 445-4290  Fax: (732) 445-4320
> ******************************************************************
>
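
For readers comparing the two rules quoted above, here is a minimal
Python sketch of how each might be decoded, plus one possible way of
eliding triple-quote sequences under F'. It is an illustration under
stated assumptions, not text from either proposal: the function names
(decode_f, decode_f_prime, encode_f_prime) are hypothetical, <eol> is
taken to be a single "\n", and under F any backslash not followed by
another backslash or a newline is assumed to pass through unchanged.
The F' decoder follows the quoted rule text literally, so n >= 2
backslashes before a newline yield n-1 backslashes followed by that
newline; the worked F' examples quoted above list no trailing newline,
so they appear to read the rule differently.

```python
import re


def decode_f(raw: str) -> str:
    """Proposal F as read here: '\\\\' becomes '\\', '\\<eol>' is removed.

    Assumption: any other backslash is left untouched.
    """
    return re.sub(
        r"\\(\\|\n)",
        lambda m: "\\" if m.group(1) == "\\" else "",
        raw,
    )


def decode_f_prime(raw: str) -> str:
    """Proposal F' as quoted: n backslashes + <eol> becomes n-1
    backslashes + <eol>, except that a lone backslash + <eol> is removed.
    Backslashes not in a run ending with <eol> have no special meaning.
    """
    def repl(m: re.Match) -> str:
        leading = m.group(1)  # backslashes before the final one in the run
        return leading + "\n" if leading else ""

    return re.sub(r"(\\*)\\\n", repl, raw)


def encode_f_prime(value: str, delimiter: str = '"') -> str:
    """Hypothetical inverse of decode_f_prime for one delimiter character."""
    # Protect existing backslash runs that end in <eol> by adding one backslash.
    protected = re.sub(r"(\\+)\n", lambda m: m.group(1) + "\\\n", value)
    # Insert backslash-<eol> between adjacent delimiter characters so that
    # no triple-delimiter sequence can appear inside the string.
    q = re.escape(delimiter)
    broken = re.sub(f"{q}(?={q})", lambda m: delimiter + "\\\n", protected)
    # A trailing delimiter would otherwise merge with the closing delimiter.
    if broken.endswith(delimiter):
        broken += "\\\n"
    return broken


if __name__ == "__main__":
    raw = "\\\\\\\n"  # three backslashes followed by a newline
    print(repr(decode_f(raw)))        # '\\'      (one backslash)
    print(repr(decode_f_prime(raw)))  # '\\\\\n'  (two backslashes, newline)
```

With this sketch, encode_f_prime('say """hi"""') contains no run of
three double quotes and decodes back to the original text, which is one
way to embed a triple-quoted fragment verbatim under F'.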



-- 
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148

