
Re: [ddlm-group] Triple-quoted strings in light of latest CIF2 draft

To: Group finalising DDLm and associated dictionaries <[email protected]>
Subject: Re: [ddlm-group] Triple-quoted strings in light of latest CIF2 draft
From: James Hester <[email protected]>
Date: Wed, 10 Aug 2011 09:43:56 +1000

Dear Herbert - For this draft to be a usable document, we must strictly limit the scope of any possible incompatible revisions. Could you therefore please list the particular further revisions you have in mind, so that we can assess how significant they might be?

We should be very close to a final draft so any, even minor, incompatible changes must be dealt with ASAP.

On Tue, Aug 9, 2011 at 7:48 PM, Herbert J. Bernstein <[email protected]> wrote:

Dear James,

  We see things very differently.  I find it difficult to understand
the interactions of a syntax document with pieces in isolation and
divorced from semantics.  Others may or may not have a similar view.
I am just one person.

I find John's next iteration very useful.  It makes it clear that
I have failed to express clearly my concerns on the handling of the
text field final newline, and it does not speak
to the status of the common semantic features document in a CIF2
context, so my vote on this proposal is negative.  As I said, I
am just one person, but that is my opinion.  I believe that, while
the combination of CIF 1.1, DDLm, dREL and the new dictionaries
will be very useful and is ready for use, this version of CIF 2
is not quite ready for use in user data files yet.

However, I would not wish to delay getting to that point, so what
I would suggest is that this draft be put into use with the warning
that it is subject to possible further small, but possibly incompatible, revisions.

Regards,
  Herbert

=====================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                 +1-631-244-3035
                 [email protected]
=====================================================

On Tue, 9 Aug 2011, James Hester wrote:

Dear Herbert:  I think it is an unreasonable imposition on busy people to
require them to constantly edit and repost a complete draft document with no
guarantee that their work will have moved the discussion forward,
particularly when the proposed changes have been adequately described by
email.  John has nevertheless kindly provided an updated draft (attached,
hopefully the attachment works).

In future, I strongly suggest we proceed as follows:
(1) we first conditionally agree or disagree or request clarification of
proposed amendments;
(2) after any further refinements, those amendments are put into the syntax
draft in the expectation that only technical changes will be necessary;
(3) we all have a final chance to check that we are satisfied with this
final draft.

I will address your comments on 'numb' type in the appropriate thread.

I specifically did not propose reserving the various letter prefixes and I'm
glad that you have clarified your position.  Given that we have already had
extensive discussion around triple quoted string semantics, the most
appropriate course to follow now would be to vote on how much of the python
triple-quoted string space to reserve.  I think that vote will need to wait
until we have agreement on the current amendments.

On Tue, Aug 9, 2011 at 1:51 AM, Herbert J. Bernstein
<[email protected]> wrote:
     Dear James,

       I don't explicitly agree or disagree with John B's proposed
     amendments, because I have not seen them in the context of a
     completed document.  Especially in a pure syntax document isolated
     from semantics, "God and the Devil are in the details," i.e. in
     the precise form of the interactions among the productions of the
     language.  John's latest message did not adopt my examples and
     clarifications, but promised some unspecified other examples:
     "Yesterday I wrote an example for combining the two protocols,
     and I can easily write one or two for the line-folding protocol
     on its own."  John himself seems uncomfortable with my
     clarifications.  Nobody other than you has even accepted the
     interpretations of the text field termination.  I look forward to
     a revised proposal bringing together what has been discussed thus
     far, and we can all consider whether it does or does not express
     an acceptable set of revisions to the existing CIF 1.1
     specification.

     [edit]

       Most interestingly -- while I proposed a specific reservation
     for the Python 2.7 treble quote syntax, you have proposed a
     reservation for "all strings commencing with triple double quotes
     or triple quotes," which excludes the necessary reservations for
     the python style elides which impact when a string terminates and
     excludes the r""", b""", u""" treble quote initiators.  I propose
     the following explicit wording be in the CIF2 document:

       Means of extending the string quoting mechanisms in CIF are
     under consideration.  Strings conforming to the Python 2.7 triple
     quote syntax as specified in section 2.4.1 of
     http://docs.python.org/reference/lexical_analysis.html and
     strings conforming to the Python 3.2 syntax as specified in
     section 2.4.1 of
     http://docs.python.org/py3k/reference/lexical_analysis.html
     are reserved for future use and should not be used for any
     purpose that conflicts with the Python 2.7 or 3.2 interpretation
     in a CIF 2 document.  In particular, string literals beginning
     with ''' or """ with or without any of the prefixes r, R, u, U,
     b, B or the various combinations of prefixes allowed in python
     should not be used with a meaning that conflicts with the python
     interpretation.
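
[Editorial illustration, not part of the original message.] A minimal sketch of how a CIF2 tool might flag tokens that fall into the reserved space described in the wording above. The function name and the prefix list are assumptions: the list is one reading of the union of the Python 2.7 and 3.2 prefix grammars (r, u, ur, b, br in any letter case) and should be checked against section 2.4.1 of each reference before being relied on. The sketch only looks at how a token begins; it does not attempt the Python rules for where such a string ends.

```python
import re

# Assumed prefix set: union of Python 2.7 and 3.2 string/bytes prefixes,
# matched case-insensitively, followed by ''' or """.
_RESERVED_START = re.compile("(?:ur|br|r|u|b)?(?:'''|\"\"\")", re.IGNORECASE)


def starts_reserved_triple_quote(token):
    """Return True if token begins like a Python triple-quoted literal."""
    return _RESERVED_START.match(token) is not None


if __name__ == "__main__":
    for tok in ['"""abc', "r'''abc", 'UR"""abc', '"abc"', "abc"]:
        print(repr(tok), starts_reserved_triple_quote(tok))
```

A parser could call such a check on each candidate data value and report a reserved-syntax warning rather than silently assigning its own meaning.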

     While I personally prefer immediate adoption of the Python 2.7
     flavor of treble quotes, this wording leaves all options from
     non-adoption to full adoption of 2.7 or 3.2 on the table.

       Regards,
       Herbert

     At 2:10 PM +1000 8/8/11, James Hester wrote:
>Dear Herbert: you have not explicitly stated that you agree with
>John B's proposed amendments, and so naturally no updated draft has
>been produced.  The latest comment on this issue was from John B and
>I have seen no reply to that.  Therefore, please indicate all those
>amendments that John B. has proposed that you are satisfied with so
>that they can be incorporated into the document.
>
>As you have in the past argued for the inseparability of syntax and
>semantics, I have raised the various issues around 'numb' type and
>indeed the internal semantics of strings in order to make sure that
>they are addressed in a joint fashion.  As a result of recent
>discussions here, my belief is that there are no changes necessary
>to CIF1.1 common semantics except:
>(i) the issues around line folding protocol and Grazulis protocol
>that we are currently discussing
>(ii) clarification of the meaning of 'numb' type.
>
>I was suggesting that we reserve all strings commencing with triple
>double quotes or triple quotes.  How does this differ from your
>proposal below?
>
>On Mon, Aug 8, 2011 at 1:26 PM, Herbert J. Bernstein
><yaya@bernstein-plus-sons.com> wrote:
>
>Dear Colleagues,
>
>   I have not seen an updated draft in response to my comments.  Have
>I missed something?  The last draft I see on the IUCr web site is
>the one from 27 July, and that document does not include any of the
>clarifications and comments that were in John's emails, especially
>his email of 2 August.
>
>   In addition, we have a sharp disagreement on the issue of syntax
>without semantics.  While I would prefer to have syntax and semantics
>dealt with as a whole, at the very least we need to clarify the
>status of the existing common semantics features document in the
>context of CIF2.  Do all features remain valid or are some deprecated
>or are CIF2 semantics something for some future effort?
>
>   If we are "to reserve triple-quoted strings for future expansion"
>then we need to be specific about what we are reserving.  I would
>propose reserving all sequences conforming to all the python 2.7
>treble quote lexing rules for future expansion.
>
>   My vote on the 27 July document without further clarification
>will, with regret, have to be negative.  While I am a strong
>proponent of moving forward with dREL, DDLm and the new dictionaries,
>I believe that the combination of CIF 1.1, dREL, DDLm and the new
>dictionaries would better serve the community at this time than would
>use of CIF 2 as currently specified, except to the minimal extent
>needed to write the dictionaries.  That way all current CIF data sets
>and most CIF software will remain valid and people can continue to
>work as they are working until a complete new specification with
>supporting software is ready for them.
>
>   Regards,
>     Herbert
>
>P.S.  While writing this message, I received James' message that says:
>
>At 1:10 PM +1000 8/8/11, James Hester wrote:
>>I am not proposing a change to CIF1.1 behaviour, as I have stated
>>before, so any 'asking for trouble' is purely CIF1.1 asking for
>>trouble.
>>
>>The cif2cif example has focused my thinking. Given that I am not
>>actually proposing anything new, there are no consequences for such
>>programs in clarifying that CIF1.1 'numb' datavalues have a dual
>>'number'/'char' datatype.
>
>I do not understand how this addresses the design of a rule of 19 to
>rule of 9 converter such as cif2cif, nor do I understand how this
>addresses the handling of ISSNs and page ranges without a dictionary
>to say the values involved are ISSNs and page ranges.  I would
>suggest this issue be on the closed meeting agenda.  Perhaps we can
>find some mutual agreement. -- HJB
>
>
>At 12:26 PM +1000 8/8/11, James Hester wrote:
>>As I understand it, the current position of this group on the latest
>>draft, incorporating Saulius's suggestions, is generally positive.
>>Herbert has raised some technical concerns, which I believe that
>>John B. has answered more than adequately.  If any concerns remain
>>among any members of this group, please advise as to what they are
>>and (if relevant) how the draft should be changed to address them.
>>
>>The related discussion topic that was waiting for the latest draft
>>was the fate of triple-quoted strings.  Assuming that the latest CIF2
>>syntax draft is accepted, and given the lack of agreement on
>>triple-quoted string syntax and semantics, should we simply drop
>>triple-quoted strings from the CIF2 syntax, possibly reserving
>>triple quotes for future expansion?  I would appreciate anybody with
>>an opinion on this contributing now so that we can have a hope of
>>wrapping it up before or during Madrid.
>>
>>
>>
>>On Wed, Aug 3, 2011 at 11:52 AM, Herbert J. Bernstein
>><yaya@bernstein-plus-sons.com> wrote:
>>
>>>   I take it, then, that you do not find the Sugar\nFlour\nButter
>>>   example in the current draft to be sufficient for this purpose.
>>>   Fair enough, but that leaves me uncertain of what kind of
>>>   clarification you are looking for.  Perhaps you would be willing
>>>   to suggest something specific?
>>>
>>
>>I would suggest a clear statement of the intended meaning for
>>
>>;abcd
>>;
>>
>>;\
>>ab\
>>cd
>>;
>>
>>;\
>>ab\
>>cd\
>>;
>>
>>;CIF>\\
>>CIF>ab\
>>CIF>cd
>>;
>>
>>The combination of the current draft and your email leads me to
>>suspect that all of these may be intended to be equivalent
>>to "abcd" and to 'abcd' and to a blank delimited abcd
>>
>>Is that correct?  If not, please disambiguate as appropriate.
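
[Editorial illustration, not part of the original message.] A minimal sketch of the reading under which all four text fields above decode to abcd. It takes the raw content between the opening ';' and the closing newline-';' and applies, in order, an assumed prefix-removal step (first line `prefix\` for prefix only, `prefix\\` for prefix plus folding) and an assumed line-folding step (first line `\`, then each line ending in a backslash is joined to the next). The function names are hypothetical. The treatment of a backslash at the end of the final line, as in the third example, is an assumption made here so the sketch is total; the thread does not settle that case.

```python
import re

_PREFIXED = re.compile(r"([^\\]+)(\\\\?)[ \t]*")  # e.g. 'CIF>\' or 'CIF>\\'
_FOLD_START = re.compile(r"\\[ \t]*")             # a lone '\' first line
_FOLDED = re.compile(r"(.*)\\[ \t]*")             # a line ending in '\'


def _unfold(lines):
    """Join each line that ends in a backslash with the line after it."""
    out, buf, pending = [], "", False
    for line in lines:
        m = _FOLDED.fullmatch(line)
        if m:
            buf += m.group(1)
            pending = True
        else:
            out.append(buf + line)
            buf, pending = "", False
    if pending:  # assumption: a fold on the last line joins with nothing
        out.append(buf)
    return out


def decode_text_field(content):
    """Decode text field content under the assumed prefix/folding rules."""
    lines = content.split("\n")
    fold = False
    m = _PREFIXED.fullmatch(lines[0])
    if m and all(line.startswith(m.group(1)) for line in lines):
        prefix = m.group(1)
        lines = [line[len(prefix):] for line in lines][1:]
        fold = m.group(2) == "\\\\"   # 'prefix\\' also requests folding
    elif _FOLD_START.fullmatch(lines[0]):
        lines = lines[1:]
        fold = True
    if fold:
        lines = _unfold(lines)
    return "\n".join(lines)


if __name__ == "__main__":
    examples = ["abcd", "\\\nab\\\ncd", "\\\nab\\\ncd\\",
                "CIF>\\\\\nCIF>ab\\\nCIF>cd"]
    for raw in examples:
        print(repr(raw), "->", repr(decode_text_field(raw)))
```

Under these assumptions each of the four examples yields the four-character value abcd, which is the equivalence the question above asks to have confirmed or corrected.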
>>
>>
>>>   If it would help, I would be happy to add a brief clarifying
>>>   remark to Change 11 that summarizes the status of comment line
>>>   folding in CIF2.  For example: "Although CIF 1.1's common semantic
>>>   features include an analogous line-folding protocol for comments,
>>>   that protocol is not incorporated into CIF 2.0 _syntax_.  Although
>>>   it remains outside the scope of CIF syntax, it is anticipated that
>>>   some CIF 2.0 processors will continue to recognize that protocol."
>>
>>I understand neither your analysis nor your suggested wording.
>>You seem to be arguing issues not in dispute.
>>How about the following?
>>
>>"The analogous line-folding protocol for comments specified in
>>paragraph 26 of the common semantic features of CIF 1.1
>>remains a common semantic feature of CIF 2.  There is no
>>change in comment syntax between CIF 1.1 and CIF2."
>>
>>
>>>   That is a question of application design, not CIF syntax.  It is
>>>   perilous to write files using that formalism, as some CIF
>>>   processors would certainly reject them, but that's outside the
>>>   scope of the spec.  The spec merely defines that such files are
>>>   not well-formed CIFs.  As for reading files that use it, I adapt
>>>   an old saw from the Fortran community: if the file does not comply
>>>   with the CIF specifications then a processor may do anything it
>>>   wants with it, including starting World War III.  I do trust that
>>>   most CIF readers will exercise greater restraint, however.
>>
>>This is a technically defensible but impractical position.
>>Some syntax errors make it impossible to guess the intent
>>of the text.  Some syntax errors have a clear intent.
>>Most syntax errors are in some fuzzy middle ground.
>>The normal practice in extending languages is to try
>>to add new constructs somewhere in the middle ground
>>of syntax errors with clear intent or at the boundary.
>>It is in large part because of the espousal of a similar
>>position to yours by X3J3 that I have a supply of
>>bumper stickers that say "Save Fortran -- Ban X3J3".
>>The community voted with its feet (and programs) and
>>the current Fortran practice is for compilers to
>>compile almost everything that looks like a reasonable
>>variant of Fortran-77, Fortran-8x, Fortran-9x and Fortran-2003
>>with a minimum of fuss.  I was just compiling a Fortran-77
>>program with the latest gfortran and it accepted the program
>>happily and without a single warning (even though I
>>used -Wall).  I use the same compiler to handle rather
>>recent Fortran-2003 code including code with the new ISO
>>C binding to allow mixed C and Fortran.  I think the
>>current approach to be a much better way to design
>>processing software than the old X3J3 approach of the
>>1980s and early 1990s that kept breaking old programs.
>>
>>=====================================================
>>   Herbert J. Bernstein, Professor of Computer Science
>>     Dowling College, Kramer Science Center, KSC 121
>>          Idle Hour Blvd, Oakdale, NY, 11769
>>
>>                   +1-631-244-3035
>>                   [email protected]
>>=====================================================
>>
>>On Tue, 2 Aug 2011, Bollinger, John C wrote:
>>
>>>   Dear Herbert,
>>>
>>>   On Monday, August 01, 2011 5:48 PM, you wrote:
>>>
>>>>   The present change document is unclear about the non-inclusion
>>>>   of the terminal linefeed in all text fields.
>>>
>>>
>>>   I take it, then, that you do not find the Sugar\nFlour\nButter
>>>   example in the current draft to be sufficient for this purpose.
>>>   Fair enough, but that leaves me uncertain of what kind of
>>>   clarification you are looking for.  Perhaps you would be willing
>>>   to suggest something specific?
>>>
>>>   [...]
>>>
>>>
>>>>   If the comment side of the original line-folding protocol is
>>>>   acceptable, the change document should say so.  Otherwise, by
>>>>   explicitly including the text field part of paragraph 26, but
>>>>   not the comment part, the impression might be created that
>>>>   the comment line folding is excluded from CIF2.
>>>
>>>
>>>   Comment folding is acceptable, _and_ it is excluded *from
>>>   standardization* into CIF *syntax*.  These are compatible because
>>>   1) Folding and unfolding comments does not change the syntactic
>>>   validity of CIFs, and 2) Comments have no meaning to CIF, other
>>>   than constituting whitespace, so folded and unfolded forms are
>>>   syntactically, grammatically, and even semantically equivalent
>>>   from a CIF perspective.
>>>
>>>   Other kinds of comment transformation are in the same class, and
>>>   are equally acceptable and equally non-standardized.  For example,
>>>   one could imagine transforming CIF comments by adding, removing,
>>>   or regularizing whitespace between the comment start character and
>>>   the first printable character.
>>>
>>>   The CIF syntax specifications do not require any particular
>>>   handling of comments beyond treating them as whitespace.
>>>   Processors aren't required even to retain them or pass them on to
>>>   an application, though they are certainly permitted to do so.
>>>   Likewise, they are permitted to perform any transformation they
>>>   wish on comment bodies.  There is no advantage in choosing any one
>>>   particular transformation to promote from a "may" to a "must".
>>>
>>>   Text field prefixing provides a good contrast.  It must be
>>>   included in the syntax if we want it, because it imposes
>>>   additional syntax requirements on text fields (either their bodies
>>>   must not start with a prefix or every line must be prefixed as
>>>   specified by the protocol).
>>>
>>>   Text field line folding is in the middle.  It doesn't impose any
>>>   additional syntactic constraints, but its inclusion would be
>>>   justified by its role in ensuring that CIF syntax is capable of
>>>   expressing arbitrary string values.  There is no analogous general
>>>   mandate for CIF comments.
>>>
>>>   If it would help, I would be happy to add a brief clarifying
>>>   remark to Change 11 that summarizes the status of comment line
>>>   folding in CIF2.  For example: "Although CIF 1.1's common semantic
>>>   features include an analogous line-folding protocol for comments,
>>>   that protocol is not incorporated into CIF 2.0 _syntax_.  Although
>>>   it remains outside the scope of CIF syntax, it is anticipated that
>>>   some CIF 2.0 processors will continue to recognize that protocol."
>>>
>>>
>>>>   The question on a terminal ;\ was not whether it is syntactically
>>>>   correct under the current CIF2 document, but what the document
>>>>   expects us to do about it.
>>>
>>>
>>>   The CIF syntax specification does not answer that question.  CIF
>>>   syntax places no particular expectations on what processors should
>>>   do with input that fails to be well-formed CIF.  Indeed, it places
>>>   very few expectations even on what they should do with input that
>>>   *is* well-formed CIF.
>>>
>>>
>>>>   It comes up because in existing
>>>>   validation suites for the line-folding protocol under CIF1, rather
>>>>   than treating it as an error, they use it as a way to allow an
>>>>   embedded \n; in a line-folded text field.  Inasmuch as we are
>>>>   in agreement that \n;\ is not a syntactically valid termination
>>>>   of a text field in CIF2 as defined in the change document, there
>>>>   is no harm in those of us who use the construct as a
>>>>   non-conflicting extension to CIF1 continuing to do so under CIF2.
>>>
>>>
>>>   That is a question of application design, not CIF syntax.  It is
>>>   perilous to write files using that formalism, as some CIF
>>>   processors would certainly reject them, but that's outside the
>>>   scope of the spec.  The spec merely defines that such files are
>>>   not well-formed CIFs.  As for reading files that use it, I adapt
>>>   an old saw from the Fortran community: if the file does not comply
>>>   with the CIF specifications then a processor may do anything it
>>>   wants with it, including starting World War III.  I do trust that
>>>   most CIF readers will exercise greater restraint, however.
>>>
>>>
>>>   Regards,
>>>
>>>   John
>>>
>>>   --
>>>   John C. Bollinger, Ph.D.
>>>   Department of Structural Biology
>>>   St. Jude Children's Research Hospital
>>>
>>>
>>>   Email Disclaimer: www.stjude.org/emaildisclaimer
>>>
>>>   _______________________________________________
>>>   ddlm-group mailing list
>>>   [email protected]
>>>   http://scripts.iucr.org/mailman/listinfo/ddlm-group
>>>
>>--
>>T +61 (02) 9717 9907
>>F +61 (02) 9717 3145
>>M +61 (04) 0249 4148
>>
