Re: Backus-Naur Form for CIF

The BNF as posted seems to have some divergence from the ACTA paper
defining CIF:

1.  In the paper it says under "The CIF restrictions to the STAR File
syntax are...":

  " ... All data names and block codes are case insensitive, i.e. _ABS and
_abs are treated identically."  

The usual approach used in Fortran of redefining a-z with productions of
the form a ::= "A"|"a" won't work here, since we need to preserve case
sensitivity for text.  In practice this would be fudged in the lexical
scanner, but, for clarity, I would suggest adding an explicit comment
explaining the case-insensitivity of data names and some productions of
the form:

  <DATA_>  ::=  {"D"|"d"} {"A"|"a"} {"T"|"t"} {"A"|"a"} "_"
  <LOOP_>  ::=  {"L"|"l"} {"O"|"o"} {"O"|"o"} {"P"|"p"} "_"

to use in place of the "data_" and "loop_" strings.

2.  The production for <data_block> does not require any leading or
trailing whitespace, so that a <CIF_file> could consist of a
<data_heading> and a <data> item immediately followed by another
<data_heading>, etc.  I cannot seem to find where the productions
explicitly require whitespace between the data item and the second
data heading.  A similar problem seems to exist in the production for
loop values.  This would certainly be solved by implicit precedence
among the productions or by operation of the lexical scanner, but it would
be best to have the BNF be unambiguous in the handling of whitespace.

3.  The paper speaks of blanks, but not of tabs, vertical tabs and
form feeds.  Most systems will handle tabs reasonably.  Not all
systems can handle vertical tab or form feed.  Are we requiring all
CIF parsers to be able to handle more than blank and tab?

4.  The paper speaks of recognising a number, and gives a syntax for a
number (with and without an ESD).  Shouldn't this be in the BNF?

5.  The paper includes an example with use of "\" escapes (e.g. 'Cu K\a')
in text and character fields.  Shouldn't this escape mechanism be
mentioned in the BNF, at least in the comments?

6.  The BNF does not seem to break out the "." and "?" metacharacter data
values.  In real parsers, these are very important cases to distinguish.
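
To make points 1, 4 and 6 concrete, here is a minimal sketch in Python of
how a lexical scanner might classify a single unquoted token.  It is an
illustration under stated assumptions, not the grammar from the paper or
from the posted BNF: the function name classify, the token kind names and
the exact number pattern (optional sign, digits with an optional decimal
point, optional exponent, optional ESD in parentheses) are all
hypothetical.  Quoted strings, text fields and "\" escapes are not handled.

```python
import re

# Assumed number shape (not taken from the paper): optional sign, digits
# with an optional decimal point, optional exponent, and an optional ESD
# in parentheses, e.g. 1.234(5).
NUMBER = re.compile(r"([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)(?:\((\d+)\))?")


def classify(token):
    """Classify one whitespace-delimited, unquoted token (hypothetical helper)."""
    lower = token.lower()
    # Keywords, block codes and data names are matched case-insensitively.
    if lower.startswith("data_"):
        return ("DATA_HEADING", lower[5:])
    if lower == "loop_":
        return ("LOOP", None)
    if token.startswith("_"):
        return ("DATA_NAME", lower)
    # The "." and "?" values are broken out as their own token kinds.
    if token == ".":
        return ("DOT", None)
    if token == "?":
        return ("QUESTION", None)
    m = NUMBER.fullmatch(token)
    if m:
        # Keep the value and the ESD digits as separate strings.
        return ("NUMBER", (m.group(1), m.group(2)))
    # Anything else is ordinary text, with its case preserved.
    return ("TEXT", token)


tokens = [classify(t) for t in "DATA_Test _ABS 1.234(5) ? Cu".split()]
```

Because the sample line is split on whitespace before classification, this
sketch also sidesteps point 2: tokens that run together without whitespace
are never separated, which is exactly the behaviour the BNF would need to
state explicitly.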

  -- Herbert

****                BERNSTEIN + SONS
****     P.O. BOX 177, BELLPORT, NY 11713-0177
*   * ***
**** *            Herbert J. Bernstein
  *   ***     yaya@bernstein-plus-sons.com
 ***     *
  *   *** 1-631-286-1339    FAX: 1-631-286-1999
