Discussion List Archives

Re: [ddlm-group] Result of concatenation operator vote

Just for the record - my vote is 'yes'.

In line with Herbert's view, it might also be worth considering that it is
impossible to specify a multiline data value that includes all of the CIF delimiters.
Granted this is an unlikely scenario, but to me this is a deficiency of the base syntax.
CIF2 provides an opportunity to deal with such issues at the syntax level, rather than
via semantics that are not part of the standard.

Cheers

Simon


From: Herbert J. Bernstein <yaya@bernstein-plus-sons.com>
To: Group finalising DDLm and associated dictionaries <ddlm-group@iucr.org>
Sent: Thursday, 28 October, 2010 11:08:36
Subject: Re: [ddlm-group] Result of concatenation operator vote

Dear James,

  Nothing is impossible, but the fact remains that for CIF2, all the uses of the backslash that had been agreed to for CIF1 were explicitly rejected, largely at Nick's insistence that somehow we were diverging from STAR. The oddity you have pointed out in the "official" CIF1 syntax document is another case of Nick's insistence which in fact diverges from CIF1 practice, existing code and existing round-trip cases.  We have somehow over the years entered an absurd never-never land in which the official CIF documents say one thing, and the established practice on which people do real work is something very different.

  If you don't believe me, see the trip test at

http://www.iucr.org/resources/cif/software/ciftest2/ciftest_2.1/outs/ciffold/longtext_out.cif

which explicitly tests that the ;\ construct works for line folding.
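
As a rough illustration of what the ;\ construct asks of a reader, the following Python sketch unfolds the lines of a folded text field. It follows a simplified reading of the line folding protocol: a field whose opening line is ;\ is folded, and a backslash at the end of a line (optionally followed by spaces or tabs) joins that line to the next. The function name unfold_text_field and the list-of-lines input are assumptions made for this example; they are not taken from CIFtbx, CBFlib, or any CIF specification.

```python
def unfold_text_field(lines):
    """Unfold the lines of a semicolon-delimited text field.

    `lines` holds the field from its opening line up to, but not
    including, the closing ';' line. Hypothetical helper for illustration.
    """
    opening = lines[0]
    if not opening.startswith(";"):
        raise ValueError("a text field must open with ';' in column one")

    # The field is treated as folded only when the opening line is ';'
    # followed by a single backslash and optional trailing blanks.
    if opening[1:].rstrip(" \t") != "\\":
        return [opening[1:]] + list(lines[1:])

    unfolded = []
    pending = ""       # text carried over from lines that ended in a backslash
    carrying = False   # True while the previous line ended in a backslash
    for line in lines[1:]:
        stripped = line.rstrip(" \t")
        if stripped.endswith("\\"):
            # Drop the backslash and trailing blanks, then join with the next line.
            pending += stripped[:-1]
            carrying = True
        else:
            unfolded.append(pending + line)
            pending = ""
            carrying = False
    if carrying:
        unfolded.append(pending)
    return unfolded


example = [
    ";\\",
    "a long line that has been split \\",
    "across two physical lines",
    ";\\",
    "",
]
print(unfold_text_field(example))
```

In this sketch the last two input lines unfold to a line holding a lone semicolon, which is the kind of content the fold marker is being used to shield.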

  The line folding protocol is an essential reality, especially to allow CIF to be used with Fortran.

  The use of the required whitespace after everything except the last token in a CIF document is an essential reality in lexical scans of existing CIF documents.
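
To make the lexical point concrete, here is a small Python sketch comparing two ways a scanner might locate the end of a semicolon-delimited text field: stopping at the first <eol>; it meets, or stopping at <eol>; only when whitespace or the end of input follows. Which of these the formal CIF1.1 grammar actually specifies is the point under discussion in this thread, and the sketch takes no position on it. The names FIRST_DELIMITER, DELIMITER_THEN_SPACE and text_field_body are invented for the example.

```python
import re

# Rule A: the field ends at the first <eol>; regardless of what follows.
FIRST_DELIMITER = re.compile(r"\n;")
# Rule B: the field ends at <eol>; only when whitespace or end of input follows.
DELIMITER_THEN_SPACE = re.compile(r"\n;(?=\s|\Z)")


def text_field_body(text, terminator):
    """Return the text between the opening ';' and the terminator.

    `text` must start at the opening ';' of the field. Hypothetical helper.
    """
    if not text.startswith(";"):
        raise ValueError("a text field must open with ';' in column one")
    match = terminator.search(text, 1)
    if match is None:
        raise ValueError("unterminated text field")
    return text[1:match.start()]


# A folded field whose content includes a line made of ';' plus a backslash.
sample = ";\\\nfirst line\n;\\\n\nsecond line\n;\n"

body_a = text_field_body(sample, FIRST_DELIMITER)
body_b = text_field_body(sample, DELIMITER_THEN_SPACE)
print(repr(body_a))
print(repr(body_b))
```

Under rule A the field is cut short at the shielded semicolon; under rule B the scanner passes over it and stops at the final semicolon line.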

  What is to me an incomprehensible adherence to a constantly changing and undocumented STAR standard has resulted in a loss of functionality that is needed to keep current applications and current CIF datasets in use.

  Of course these issues can be resolved.  I keep accumulating fudges for CIFtbx and CBFlib to deal with them.  The problem is that, without any COMCIFS level agreement on what the preferred fudges are, there is no reason to expect that the files my code reads and writes will be compatible with the files that, say, your code reads and writes, or compatible with the files that, say, John Westbrook's code reads and writes, almost guaranteeing that CIF is going to degenerate even more than it has into multiple idiosyncratic dialects.  To me this seems to be the antithesis of the goal of the creation of COMCIFS -- which was, as its name says, to maintain the CIF _standard_.

  I apologize for sounding so preachy and stuffy, but I really think it would be a good idea to resolve these issues in some commonly agreed manner and try to keep CIF as a common language, rather than heading further into multiple dialects.

  Regards,
    Herbert

=====================================================
Herbert J. Bernstein, Professor of Computer Science
  Dowling College, Kramer Science Center, KSC 121
        Idle Hour Blvd, Oakdale, NY, 11769

                +1-631-244-3035
                yaya@dowling.edu
=====================================================

On Thu, 28 Oct 2010, James Hester wrote:

> Dear Herbert,
>
> Thanks for your detailed response. I think I am failing to understand
> something elementary.  I see nothing in your comments below to indicate why
> it would be impossible to use the line-folding protocol to split long lines
> over multiple lines, for arbitrary CIF2 text.  Additionally, I believe your
> first example below would be a syntax error under both CIF1.1 and CIF2,
> because the <eol><semicolon> sequence terminates the datavalue regardless of
> following text.  In other words, I know of nothing in CIF1.1 to indicate
> that <eol><semicolon> terminates the datavalue only if there is whitespace
> or <EOF> after the <eol><semicolon>.
>
> If I am incorrect in my thinking, I would appreciate a correction expressed
> in terms of the formal CIF1.1 grammar.
>
> James.
>
> On Thu, Oct 28, 2010 at 2:27 PM, Herbert J. Bernstein
> <yaya@bernstein-plus-sons.com> wrote:
>      Dear James,
>
>       The line folding protocol is in section 26 of
>
>      http://www.iucr.org/resources/cif/spec/version1.1/semantics
>
>      I tried to get agreement on continuing this use of the backslash
>      and that was firmly and explicitly rejected, effectively
>      removing the entire line folding protocol, which depends on it.
>       Even if we restore the use of the backslash, there has been a
>      significant change in the termination of a text field.  In CIF
>      1.1, a text field can only end with <eol>; followed either by
>      whitespace or the end of a file, so the existing line folding
>      protocol allows
>
>      ;\
>      this is an example of an embedded text field
>      ;\
>
>      an embedded text field
>      ;\
>
>      ;
>
>      which is no longer valid under CIF2 because all quoted fields
>      end on the first occurrence of their delimiter, and as stated in
>      the new syntax document, "CIF2 keywords, data block headers,
>      save frame headers, data names, and data values must all be
>      separated from each other by whitespace. Whitespace not
>      otherwise part of a CIF2 syntax element is significant only for
>      this purpose.
>
>      Reasoning: The CIF1 specification relies implicitly on the
>      syntactic structure of the language to require whitespace
>      separators between syntax elements. The CIF2 syntax no longer
>      implicitly provides whitespace separators in some cases
>      (notably, after most types of data values), therefore the
>      requirement is now made explicit."
>
>      So under CIF2, the use of the elide to shield the <eol>; is
>      explicitly an error.
>
>      It would be very nice to have the line folding back, either
>      in the form of the use of the backslash, or by using the
>      string concatenation operator.
>
>
> Regards,
>  Herbert
>
> =====================================================
>  Herbert J. Bernstein, Professor of Computer Science
>   Dowling College, Kramer Science Center, KSC 121
>        Idle Hour Blvd, Oakdale, NY, 11769
>
>                 +1-631-244-3035
>                 yaya@dowling.edu
> =====================================================
>
> On Thu, 28 Oct 2010, James Hester wrote:
>
>      I would be happy to indicate the status of the line folding
>      protocol under the CIF2 draft when introducing the CIF2 draft to
>      COMCIFS.  Perhaps you could write a few words in reply to this
>      email giving a description of the status of the line folding
>      protocol under CIF2, as I'm not sure why line-folding and CIF2
>      are incompatible.
>
>      On Thu, Oct 28, 2010 at 11:09 AM, Herbert J. Bernstein
>      <yaya@bernstein-plus-sons.com> wrote:
>           Dear James,
>
>            I don't mind if the approval of CIF2 has priority if the
>           debate on that ends before debate on the concatenation
>           operator, but inasmuch as either the concatenation operator
>           or some other replacement for the line folding protocol is
>           necessary before CIF2 can become a full replacement for
>           CIF1, I would suggest that the matter be brought to COMCIFS
>           at the same time and we see what happens.
>
>            I would also like to bring the issue of how we transition
>           imgCIF before COMCIFS.  That is another area where CIF2 does
>           not yet provide support.
>
>            Regards,
>              Herbert
>
>           =====================================================
>            Herbert J. Bernstein, Professor of Computer Science
>             Dowling College, Kramer Science Center, KSC 121
>                  Idle Hour Blvd, Oakdale, NY, 11769
>
>                           +1-631-244-3035
>                           yaya@dowling.edu
>           =====================================================
>
>
>      On Thu, 28 Oct 2010, James Hester wrote:
>
>           My count is 2 in favour, 4 against, with Simon (whose vote
>           doesn't appear to have come in) potentially making that 3
>           in favour and 4 against.  These are not entirely convincing
>           numbers for either side. However, although the proponents
>           of the concatenation operator are free to address COMCIFS
>           on this question, a replay of this vote within COMCIFS
>           would lead to at least 3 opposed and at least one in
>           favour, with Nick's opposition making it (at best) a 4-2
>           vote against.  So, I suggest that at this point we delay
>           any further consideration of concatenation until COMCIFS
>           has approved CIF2.
>
>           In a subsequent email I will therefore put the current CIF2
>           spec to a DDLm group vote, and assuming it passes will
>           present it to COMCIFS for final approval.
>           --
>           T +61 (02) 9717 9907
>           F +61 (02) 9717 3145
>           M +61 (04) 0249 4148
>
>
>      _______________________________________________
>      ddlm-group mailing list
>      ddlm-group@iucr.org
>      http://scripts.iucr.org/mailman/listinfo/ddlm-group
_______________________________________________
ddlm-group mailing list
ddlm-group@iucr.org
http://scripts.iucr.org/mailman/listinfo/ddlm-group

International Union of Crystallography
