Discussion List Archives

Re: [ddlm-group] Result of concatenation operator vote

Dear Herbert:

It is clear from this discussion that the status of the line-folding protocol is unchanged in CIF2.  Both CIF1 and CIF2 are explicit about tokens being separated by whitespace, and neither CIF1 nor CIF2 allows backslashes to either post-elide or pre-elide the <eol><semicolon> digraph.  Likewise, those who wish to interpret the line-folding protocol contrary to the published standard for CIF1 and CIF2 will presumably continue to do so.

I would therefore be prepared to state that the line-folding protocol is unaffected by the CIF2 syntax changes when presenting CIF2 to COMCIFS, if that is any help.

If you wish to instead reopen the elide debate (http://www.iucr.org/__data/iucr/lists/ddlm-group/msg00331.html), with the commensurate time that will be required to reach a conclusion, then I suggest that you put the question of reopening the elide debate to a straw poll first, to find out if others are as concerned as you about the issue.  As you have argued forcefully on several occasions, we need to wrap this up soon, and discussing elides will not be quick.

I am not in favour of reopening the elide discussion, as we are likely to simply cover exactly the same ground as this time last year, and take just as long about it as well.  I strongly urge you to delay or abandon discussion of this issue until after CIF2 has been adopted by COMCIFS.

James.

(some comments inserted below)

On Thu, Oct 28, 2010 at 9:08 PM, Herbert J. Bernstein <yaya@bernstein-plus-sons.com> wrote:
Dear James,

 Nothing is impossible, but the fact remains that for CIF2, all the uses of the backslash that had been agreed to for CIF1 were explicitly rejected, largely at Nick's insistence that somehow we were diverging from STAR. The oddity you have pointed out in the "official" CIF1 syntax document is another case of Nick's insistence which in fact diverges from CIF1 practice, existing code and existing round-trip cases.  We have somehow over the years entered an absurd never-never land in which the official CIF documents say one thing, and the established practice on which people do real work is something very different.

As John B and I have pointed out, there is no use of backslash that is allowed in CIF1 and not allowed in CIF2.  What this group did reject is a use of backslash to elide <eol> at a syntactic level.   Eliding <eol> within a datavalue *after* tokenisation, as the data folding protocol does, is still perfectly legal in both CIF1 and CIF2.
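
A minimal sketch of unfolding applied after tokenisation, assuming the rules of the CIF1.1 line-folding protocol are these: the field content opens with a line holding only a backslash, and any later line whose last non-blank character is a backslash is joined to the next by dropping the backslash, any trailing blanks and the <eol>. The function name is hypothetical and is not taken from any CIF library.

```python
import re

# A backslash that is the last character on a line, ignoring trailing blanks.
_TRAILING_BACKSLASH = re.compile(r"\\[ \t]*$")


def unfold_text_field(content: str) -> str:
    """Unfold the content of a text field that has already been tokenised.

    `content` is everything between the opening semicolon and the closing
    <eol><semicolon>, exclusive. The tokeniser never sees the backslashes
    as anything special; they are only interpreted here.
    """
    lines = content.split("\n")
    # Assumed trigger: the first line holds only a backslash (plus blanks).
    if not _TRAILING_BACKSLASH.fullmatch(lines[0]):
        return content  # not a folded field, return unchanged
    unfolded = []
    pending = None  # text carried over from folded lines, or None
    for line in lines[1:]:
        prefix = "" if pending is None else pending
        match = _TRAILING_BACKSLASH.search(line)
        if match:
            # Folded line: drop the backslash and blanks, keep accumulating.
            pending = prefix + line[:match.start()]
        else:
            unfolded.append(prefix + line)
            pending = None
    if pending is not None:
        unfolded.append(pending)
    return "\n".join(unfolded)
```

Because the function only ever receives the content of a complete data value, it has no way to affect where that data value ends.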
 

 If you don't believe me, see the trip test at

http://www.iucr.org/resources/cif/software/ciftest2/ciftest_2.1/outs/ciffold/longtext_out.cif

which explicitly tests that the ;\ construct works for line folding.

This trip test contradicts the published CIF standards and is therefore incorrect.  Who do I submit the bug report to?

 The line folding protocol is an essential reality, especially to allow CIF to be used with Fortran.

I have no issue with the line folding protocol as published, in fact it quite elegantly solves legacy issues.

 The use of the required whitespace after everything except the last token in a CIF document is an essential reality in lexical scans of existing CIF documents.

 What is to me an incomprehensible adherence to a constantly changing and undocumented STAR standard has resulted in a loss of functionality that is needed to keep current applications and current CIF datasets in use.

There is no reason to bring the STAR standard into this, changeable or otherwise.  It is the detailed formal grammar and descriptive text for CIF1 which does not envision use of backslash as you propose.

 Of course these issues can be resolved.  I keep accumulating fudges for CIFtbx and CBFlib to deal with them.  The problem is that, without any COMCIFS level agreement on what the preferred fudges are, there is no reason to expect that the files my code reads and writes will be compatible with the files that, say, your code reads and writes, or compatible with the files that, say, John Westbrook's code reads and writes, almost guaranteeing that CIF is going to degenerate even more than it has into multiple idiosyncratic dialects.  To me this seems to be the antithesis of the goal of the creation of COMCIFS -- which was, as its name says, to maintain the CIF _standard_.

Indeed, we *are* trying to maintain a *standard*, and so it is most unhelpful when members of the standards committee support and actively advocate behaviour which directly contradicts the published standard.

 I apologize for sounding so preachy and stuffy, but I really think it would be a good idea to resolve these issues in some commonly agreed manner and try to keep CIF as a common language, rather than heading further into multiple dialects.

On Thu, 28 Oct 2010, James Hester wrote:

Dear Herbert,

Thanks for your detailed response. I think I am failing to understand
something elementary.  I see nothing in your comments below to indicate why
it would be impossible to use the line-folding protocol to split long lines
over multiple lines, for arbitrary CIF2 text.  Additionally, I believe your
first example below would be a syntax error under both CIF1.1 and CIF2,
because the <eol><semicolon> sequence terminates the datavalue regardless of
following text.  In other words, I know of nothing in CIF1.1 to indicate
that <eol><semicolon> terminates the datavalue only if there is whitespace
or <EOF> after the <eol><semicolon>.
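
A minimal sketch of this reading, assuming a text field opens with a semicolon in column one and closes at the first <eol><semicolon>, whatever follows. The function names are hypothetical, and the input is the example from Herbert's message quoted below with its quoting indentation removed.

```python
def split_text_field(source: str) -> tuple[str, str]:
    """Split `source` at the first <eol><semicolon> after the opening ';'.

    Returns (content, rest): the field content and whatever follows the
    closing semicolon.
    """
    if not source.startswith(";"):
        raise ValueError("text field must open with a semicolon")
    end = source.find("\n;", 1)
    if end == -1:
        raise ValueError("unterminated text field")
    return source[1:end], source[end + 2:]


def followed_by_whitespace_or_eof(rest: str) -> bool:
    """True if the closing semicolon is followed by whitespace or end of input."""
    return rest == "" or rest[0] in " \t\r\n"


EXAMPLE = (
    ";\\\n"
    "this is an example of an embedded text field\n"
    ";\\\n"
    "\n"
    "an embedded text field\n"
    ";\\\n"
    "\n"
    ";\n"
)

content, rest = split_text_field(EXAMPLE)
```

Under this reading the field closes after the first line of text, and the character that follows the closing semicolon is a backslash rather than whitespace.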

If I am incorrect in my thinking, I would appreciate a correction expressed
in terms of the formal CIF1.1 grammar.

James.

On Thu, Oct 28, 2010 at 2:27 PM, Herbert J. Bernstein
<yaya@bernstein-plus-sons.com> wrote:
     Dear James,

      The line folding protocol is in section 26 of

     http://www.iucr.org/resources/cif/spec/version1.1/semantics

     I tried to get agreement on continuing this use of the backslash
     and that was firmly and explicitly rejected, effectively
     removing the entire line folding protocol, which depends on it.
      Even if we restore the use of the backslash, there has been a
     significant change in the termination of a text field.  In CIF
     1.1, a text field can only end with <eol>; followed either by
     whitespace or the end of a file, so the existing line folding
     protocol allows

     ;\
     this is an example of an embedded text field
     ;\

     an embedded text field
     ;\

     ;

     which is no longer valid under CIF2 because all quoted fields
     end on the first occurrence of their delimiter, and as stated in
     the new syntax document, "CIF2 keywords, data block headers,
     save frame headers, data names, and data values must all be
     separated from each other by whitespace. Whitespace not
     otherwise part of a CIF2 syntax element is significant only for
     this purpose.

     Reasoning: The CIF1 specification relies implicitly on the
     syntactic structure of the language to require whitespace
     separators between syntax elements. The CIF2 syntax no longer
     implicitly provides whitespace separators in some cases
     (notably, after most types of data values), therefore the
     requirement is now made explicit."

     So under CIF2, the use of the elide to shield the <eol>; is
     explicitly an error.

     It would be very nice to have the line folding back, either
     in the form of the use of the backslash, or by using the
     string concatenation operator.


Regards,
 Herbert

=====================================================
 Herbert J. Bernstein, Professor of Computer Science
  Dowling College, Kramer Science Center, KSC 121
       Idle Hour Blvd, Oakdale, NY, 11769

                +1-631-244-3035
                yaya@dowling.edu
=====================================================

On Thu, 28 Oct 2010, James Hester wrote:

     I would be happy to indicate the status of the line folding protocol
     under the CIF2 draft when introducing the CIF2 draft to COMCIFS.
     Perhaps you could write a few words in reply to this email giving a
     description of the status of the line folding protocol under CIF2, as
     I'm not sure why line-folding and CIF2 are incompatible.

     On Thu, Oct 28, 2010 at 11:09 AM, Herbert J. Bernstein
     <yaya@bernstein-plus-sons.com> wrote:
          Dear James,

           I don't mind if the approval of CIF2 has priority if the debate
          on that ends before debate on the concatenation operator, but
          inasmuch as either the concatenation operator or some other
          replacement for the line folding protocol is necessary before
          CIF2 can become a full replacement for CIF1, I would suggest
          that the matter be brought to COMCIFS at the same time
          and we see what happens.

           I would also like to bring the issue of how we transition
          imgCIF before COMCIFS.  That is another area where CIF2 does
          not yet provide support.

           Regards,
             Herbert


     On Thu, 28 Oct 2010, James Hester wrote:

          My count is 2 in favour, 4 against, with Simon (whose vote
          doesn't appear to have come in) potentially making that 3 in
          favour and 4 against.  These are not entirely convincing numbers
          for either side. However, although the proponents of the
          concatenation operator are free to address COMCIFS on this
          question, a replay of this vote within COMCIFS would lead to at
          least 3 opposed and at least one in favour, with Nick's
          opposition making it (at best) a 4-2 vote against.  So, I suggest
          that at this point we delay any further consideration of
          concatenation until COMCIFS has approved CIF2.

          In a subsequent email I will therefore put the current CIF2 spec
          to a DDLm group vote, and assuming it passes will present it to
          COMCIFS for final approval.
          --
          T +61 (02) 9717 9907
          F +61 (02) 9717 3145
          M +61 (04) 0249 4148


     _______________________________________________
     ddlm-group mailing list
     ddlm-group@iucr.org
     http://scripts.iucr.org/mailman/listinfo/ddlm-group




