Re: Backus-Naur Form for CIF

  • Subject: Re: Backus-Naur Form for CIF
  • From: Nick Spadaccini <nick@xxxxxxxxxxxxx>
  • Date: Tue, 10 Oct 2000 17:58:12 +0100 (BST)

On Wed, 4 Oct 2000, Brian McMahon wrote:

  [ ... Gumph from Herb and Nick deleted ... ]

> This is a nice example of "literalism", and good reason to have these details
> thrashed out formally. I had always taken the view that the "reserved words"
> data_ and loop_ were case-SENSITIVE; so data_foo and data_FOO were
> identical, DATA_foo was invalid. I'm happy to accept the new convention,
> although it does mean some code rewriting.

Nothing in STAR is case sensitive, including the reserved words. Even the
datanames aren't case sensitive, and I believe this was carried over to CIF. I
have taken the liberty of updating the BNF accordingly.
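
As a rough illustration of what case-insensitive reserved words mean for
parsing code, here is a minimal Python sketch. The function names
(classify_token, same_name) are hypothetical and not taken from any existing
CIF library, and only the two reserved words mentioned in this thread are
handled.

```python
# Minimal sketch: fold case before comparing reserved words and datanames.
# classify_token and same_name are illustrative names, not a real CIF API.

def classify_token(token: str) -> str:
    """Classify a whitespace-delimited token, ignoring case."""
    lowered = token.lower()
    if lowered == "loop_":
        return "loop"
    # A data heading is the reserved prefix followed by a block name.
    if lowered.startswith("data_") and len(lowered) > len("data_"):
        return "data_heading"
    if token.startswith("_"):
        return "data_name"
    return "value"


def same_name(a: str, b: str) -> bool:
    """Compare two datanames or block names without regard to case."""
    return a.lower() == b.lower()
```

Under this rule DATA_foo, data_foo and Data_FOO all classify as data
headings, and they name the same block.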

 [ ... more from Herb and Nick deleted - concerned the error
   in the production for a data block which allowed for NO whitespace
   between the last data value and the next data block ... ]

> I prefer the exception at the end of the file (i.e. the second alternative).
> Could it be formalised by including an end-of-file token?
>    <data_block>   ::= <wspace>* <data_heading> <data>+ (<wspace>|<eof>)+

This will work, and it need only be defined here since it is the only place
the EOF can occur (unless it is an empty file). The definition of EOF
is of course woolly and I have just defined it as the operating system
dependent end-of-file marker. The BNF has been updated accordingly.
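
To make the effect of the quoted production concrete, here is a heavily
simplified Python sketch. It assumes every data item is a plain run of
non-whitespace characters (no quoted strings, text fields or loops), and it
models <eof> with the regular expression anchor \Z rather than any operating
system marker. The names DATA_BLOCK and split_blocks are hypothetical.

```python
import re

# Simplified stand-in for:
#   <data_block> ::= <wspace>* <data_heading> <data>+ (<wspace>|<eof>)+
# Assumption: a data item is any non-whitespace run that is not a heading.
DATA_BLOCK = re.compile(
    r"\s*"                      # <wspace>*
    r"(data_\S+)"               # <data_heading>
    r"((?:\s+(?!data_)\S+)+)"   # <data>+, each item preceded by whitespace
    r"(?:\s+|\Z)",              # (<wspace>|<eof>)+, with \Z standing in for <eof>
    re.IGNORECASE,              # reserved words are not case sensitive
)


def split_blocks(text):
    """Return a list of (heading, items) pairs, one per data block."""
    blocks = []
    pos = 0
    while pos < len(text):
        match = DATA_BLOCK.match(text, pos)
        if match is None:
            raise ValueError(f"no data block at offset {pos}")
        blocks.append((match.group(1), match.group(2).split()))
        pos = match.end()
    return blocks
```

The last block in a file may end without trailing whitespace because \Z
matches there. Every earlier block has to be followed by at least one
whitespace character before the next heading can begin.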

> If you permit <vt> and <ff> as allowed characters, you should treat them as
> white space. I've no objection to forbidding them if that's what is
> generally preferred. BUT I have a half-recollection that <ff> was introduced
> at some stage as a mandatory character in the header to image data in the
> crystallographic binary file (to stop "more" and other Unix pagers from
> writing binary data to screen). Is that still the case? And how were such
> embedded <ff>s to be handled in imgCIF?

I treat both <vt> and <ff> as "whitespaces" according to my productions.
However, I differentiate them slightly. I throw <vt> into <spaces> because I
see it as a very long tab which wraps to the next line. Most editors
don't actually do much with it except highlight it as a single "funny"
character, so I interpret it as a space.

I put <ff> in the <terminate> production because it in effect breaks the
line and is akin to lots of <newlines>. A somewhat subjective view I'll
admit.
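
The split described above could be sketched in Python as follows. Only the
placement of <vt> and <ff> comes from this message. The other members of
each set (blank and tab in <spaces>, newline and carriage return in
<terminate>) are assumptions, and the names SPACES, TERMINATE, classify_char
and logical_lines are hypothetical.

```python
import re

# Assumed membership; only "\v" in SPACES and "\f" in TERMINATE are taken
# from the discussion, the rest is guessed for illustration.
SPACES = {" ", "\t", "\v"}       # <vt> treated like a blank
TERMINATE = {"\n", "\r", "\f"}   # <ff> treated like a line break


def classify_char(ch: str) -> str:
    """Return which whitespace production a character falls under."""
    if ch in SPACES:
        return "spaces"
    if ch in TERMINATE:
        return "terminate"
    return "other"


def logical_lines(text: str):
    """Split text at <terminate> characters; a CR LF pair counts once."""
    return re.split(r"\r\n|[\n\r\f]", text)
```

With this classification a <vt> stays inside its line, while a <ff> ends
the current line just as a newline would.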

Please check out the revised BNF and begin the next cycle of discussions.


Nick

--------------------------------
Dr Nick Spadaccini
Department of Computer Science              voice: +(61 8) 9380 3452
University of Western Australia               fax: +(61 8) 9380 1089
Nedlands, Perth,  WA  6907                 email: nick@cs.uwa.edu.au
AUSTRALIA                        web: http://www.cs.uwa.edu.au/~nick



