[ddlm-group] Triple-quoted strings in light of latest CIF2 draft

  • To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
  • Subject: [ddlm-group] Triple-quoted strings in light of latest CIF2 draft
  • From: James Hester <jamesrhester@gmail.com>
  • Date: Mon, 8 Aug 2011 12:26:14 +1000
As I understand it, the current position of this group on the latest draft, incorporating Saulius's suggestions, is generally positive.  Herbert has raised some technical concerns, which I believe that John B. has answered more than adequately.  If any concerns remain among any members of this group, please advise as to what they are and (if relevant) how the draft should be changed to address them.

The related discussion topic that was waiting on the latest draft is the fate of triple-quoted strings.  Assuming that the latest CIF2 syntax draft is accepted, and given the lack of agreement on triple-quoted string syntax and semantics, should we simply drop triple-quoted strings from the CIF2 syntax, possibly reserving triple quotes for future expansion?  I would appreciate anybody with an opinion on this contributing now so that we can have a hope of wrapping it up before or during Madrid.



On Wed, Aug 3, 2011 at 11:52 AM, Herbert J. Bernstein <yaya@bernstein-plus-sons.com> wrote:
> I take it, then, that you do not find the Sugar\nFlour\nButter example
> in the current draft to be sufficient for this purpose.  Fair enough,
> but that leaves me uncertain of what kind of clarification you are
> looking for.  Perhaps you would be willing to suggest something
> specific?
>

I would suggest a clear statement of the intended meaning for

;abcd
;

;\
ab\
cd
;

;\
ab\
cd\
;

;CIF>\\
CIF>ab\
CIF>cd
;

The combination of the current draft and your email leads me to
suspect that all of these may be intended to be equivalent
to "abcd" and to 'abcd' and to a blank-delimited abcd.

Is that correct?  If not, please disambiguate as appropriate.
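
To make the question concrete, here is a minimal Python sketch of one reading
of the prefix and line-folding protocols under which all four text fields
above decode to abcd. It is an illustration only, not the draft's definition.
The function name decode_text_field is hypothetical, and two behaviours are
assumptions that neither the draft nor this thread settles: a trailing
backslash on the last content line is simply dropped, and a prefixed field
with an unprefixed line is rejected.

```python
def decode_text_field(raw: str) -> str:
    """Decode a semicolon-delimited text field (hypothetical helper).

    `raw` runs from the opening ';' to the closing ';' on its own line.
    The newline before the closing ';' is treated as part of the delimiter.
    """
    lines = raw.split("\n")
    if len(lines) < 2 or not lines[0].startswith(";") or lines[-1] != ";":
        raise ValueError("not a semicolon-delimited text field")
    body = [lines[0][1:]] + lines[1:-1]

    # The first line decides which protocols apply:
    #   ;\          folding only
    #   ;prefix\    prefix only
    #   ;prefix\\   prefix and folding
    first = body[0].rstrip(" \t")
    prefix, fold = "", False
    if first.endswith("\\\\"):
        prefix, fold = first[:-2], True
    elif first.endswith("\\"):
        prefix, fold = first[:-1], first == "\\"

    if not prefix and not fold:
        return "\n".join(body)      # plain text field, no protocol line

    content = body[1:]              # the protocol line itself is not content
    if prefix:
        if not all(line.startswith(prefix) for line in content):
            raise ValueError("every line must carry the prefix")  # assumption
        content = [line[len(prefix):] for line in content]
    if not fold:
        return "\n".join(content)

    out = []
    for i, line in enumerate(content):
        stripped = line.rstrip(" \t")
        if stripped.endswith("\\"):
            # Folded line: drop the backslash and the newline after it.
            # Assumption: this also applies to the last content line.
            out.append(stripped[:-1])
        else:
            out.append(line)
            if i < len(content) - 1:
                out.append("\n")
    return "".join(out)


# The four text fields from the message, written as Python string literals.
samples = [
    ";abcd\n;",
    ";\\\nab\\\ncd\n;",
    ";\\\nab\\\ncd\\\n;",
    ";CIF>\\\\\nCIF>ab\\\nCIF>cd\n;",
]
for sample in samples:
    print(repr(decode_text_field(sample)))
```

Under this reading the third field is the only one whose result depends on an
assumption (the final backslash); a stricter processor could instead treat it
as an error or keep the backslash.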

> If it would help, I would be happy to add a brief clarifying remark to
> Change 11 that summarizes the status of comment line folding in CIF2.
> For example: "Although CIF 1.1's common semantic features include an
> analogous line-folding protocol for comments, that protocol is not
> incorporated into CIF 2.0 _syntax_.  Although it remains outside the
> scope of CIF syntax, it is anticipated that some CIF 2.0 processors will
> continue to recognize that protocol."

I understand neither your analysis nor your suggested wording.
You seem to be arguing issues not in dispute.
How about the following?

"The analogous line-folding protocol for comments specified in
paragraph 26 of the common semantic features of CIF 1.1
remains a common semantic feature of CIF 2.  There is no
change in comment syntax between CIF 1.1 and CIF2."

> That is a question of application design, not CIF syntax.  It is
> perilous to write files using that formalism, as some CIF processors
> would certainly reject them, but that's outside the scope of the spec.
> The spec merely defines that such files are not well-formed CIFs.  As
> for reading files that use it, I adapt an old saw from the Fortran
> community: if the file does not comply with the CIF specifications then
> a processor may do anything it wants with it, including starting World
> War III.  I do trust that most CIF readers will exercise greater
> restraint, however.

This is a technically defensible but impractical position.
Some syntax errors make it impossible to guess the intent
of the text.  Some syntax errors have a clear intent.
Most syntax errors are in some fuzzy middle ground.
The normal practice in extending languages is to try
to add new constructs somewhere in the middle ground
of syntax errors with clear intent or at the boundary.
It is in large part because of the espousal of a similar
position to yours by X3J3 that I have a supply of
bumper stickers that say "Save Fortran -- Ban X3J3".
The community voted with its feet (and programs) and
the current Fortran practice is for compilers to
compile almost everything that looks like a reasonable
variant of Fortran-77, Fortran-8x, Fortran-9x and Fortran-2003
with a minimum of fuss.  I was just compiling a Fortran-77
program with the latest gfortran and it accepted the program
happily and without a single warning (even though I
used -Wall).  I use the same compiler to handle rather
recent Fortran-2003 code including code with the new ISO
C binding to allow mixed C and Fortran.  I think the
current approach to be a much better way to design
processing software than the old X3J3 approach of the
1980s and early 1990s that kept breaking old programs.
=====================================================
 Herbert J. Bernstein, Professor of Computer Science
   Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                 +1-631-244-3035
                 yaya@dowling.edu
=====================================================

On Tue, 2 Aug 2011, Bollinger, John C wrote:

> Dear Herbert,
>
> On Monday, August 01, 2011 5:48 PM, you wrote:
>
>> The present change document is unclear about the non-inclusion of the
>> terminal linefeed in all text fields.
>
>
> I take it, then, that you do not find the Sugar\nFlour\nButter example
> in the current draft to be sufficient for this purpose.  Fair enough,
> but that leaves me uncertain of what kind of clarification you are
> looking for.  Perhaps you would be willing to suggest something
> specific?
>
> [...]
>
>
>> If the comment side of the original line-folding protocol is
>> acceptable, the change document should say so.  Otherwise, by
>> explicitly including the text field part of paragraph 26, but
>> not the comment part, the impression might be created that
>> the comment line folding is excluded from CIF2.
>
>
> Comment folding is acceptable, _and_ it is excluded *from
> standardization* into CIF *syntax*.  These are compatible because 1)
> Folding and unfolding comments does not change the syntactic validity of
> CIFs 2) Comments have no meaning to CIF, other than constituting
> whitespace, so folded and unfolded forms are syntactically,
> grammatically, and even semantically equivalent from a CIF perspective.
>
> Other kinds of comment transformation are in the same class, and are
> equally acceptable and equally non-standardized.  For example, one could
> imagine transforming CIF comments by adding, removing, or regularizing
> whitespace between the comment start character and the first printable
> character.
>
> The CIF syntax specifications do not require any particular handling of
> comments beyond treating them as whitespace.  Processors aren't required
> even to retain them or pass them on to an application, though they are
> certainly permitted to do so.  Likewise, they are permitted perform any
> transformation they wish on comment bodies.  There is no advantage in
> choosing any one particular transformation to promote from a "may" to a
> "must".
>
> Text field prefixing provides a good contrast.  It must be included in
> the syntax if we want it, because it imposes additional syntax
> requirements on text fields (either their bodies must not start with a
> prefix or every line must be prefixed as specified by the protocol).
>
> Text field line folding is in the middle.  It doesn't impose any
> additional syntactic constraints, but its inclusion would be justified
> by its role in ensuring that CIF syntax is capable of expressing
> arbitrary string values.  There is no analogous general mandate for CIF
> comments.
>
> If it would help, I would be happy to add a brief clarifying remark to
> Change 11 that summarizes the status of comment line folding in CIF2.
> For example: "Although CIF 1.1's common semantic features include an
> analogous line-folding protocol for comments, that protocol is not
> incorporated into CIF 2.0 _syntax_.  Although it remains outside the
> scope of CIF syntax, it is anticipated that some CIF 2.0 processors will
> continue to recognize that protocol."
>
>
>> The question on a terminal ;\ was not whether is it syntactically
>> correct under the current CIF2 document, but what the document
>> expects us to do about it.
>
>
> The CIF syntax specification does not answer that question.  CIF syntax
> places no particular expectations on what processors should do with
> input that fails to be well-formed CIF.  Indeed, it places very few
> expectations even on what they should do with input that *is*
> well-formed CIF.
>
>
>>  It comes up because in existing
>> validation suites for the line-folding protocol under CIF1, rather
>> than treating as an error, it uses it as a way to allow an
>> embedded \n; in a line-folded text field.  Inasmuch as we are
>> in agreement that \n;\ is not a syntactically valid termination
>> of a text field in CIF2 as defined in the change document, there
>> is no harm in those of us who use the construct under as a
>> non-conflicting extension to CIF1 to continue to do so under CIF2.
>
>
> That is a question of application design, not CIF syntax.  It is
> perilous to write files using that formalism, as some CIF processors
> would certainly reject them, but that's outside the scope of the spec.
> The spec merely defines that such files are not well-formed CIFs.  As
> for reading files that use it, I adapt an old saw from the Fortran
> community: if the file does not comply with the CIF specifications then
> a processor may do anything it wants with it, including starting World
> War III.  I do trust that most CIF readers will exercise greater
> restraint, however.
>
>
> Regards,
>
> John
>
> --
> John C. Bollinger, Ph.D.
> Department of Structural Biology
> St. Jude Children's Research Hospital
>
>
> Email Disclaimer:  www.stjude.org/emaildisclaimer
>
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group
>



--
T +61 (02) 9717 9907
F +61 (02) 9717 3145
M +61 (04) 0249 4148