Discussion List Archives

Re: Parsing of comments in BNC definition

  • Subject: Re: Parsing of comments in BNC definition
  • From: Nick Spadaccini <nick@xxxxxxxxxxxxx>
  • Date: Thu, 23 Nov 2000 16:23:41 GMT

On Thu, 23 Nov 2000, James Hester wrote:

> To demonstrate, if I have a data block heading
> 
> data_block#53 _a_data_item
> 
> does the production
> 
> <data_heading>          ::= <DATA_> <non_blank_char>+
> 
> suck up the #53 as well (# is listed as a non_blank_char)?  Or does

Yes, that is the intention of that production rule. As Brian says in a
follow-up mail, it is also what the "universal guidance" program Star_Base
returns (when it doesn't seg fault and dump core).

> <data>     ::= { <wspace>+ <data_name> <wspace>* <blank> <data_value_1> }
> 
> mean that #53... -> comment -> wspace mean that '#53 _a_data_item' can
> be considered as the whitespace before the data name?  This question
> applies in a number of other places as well, where '#' might occur as
> part of a data value.

With all of these things, the order in which the production rules are
specified matters when you build your parser. In fact, your parser should
only be looking at a subset of the rules in any given context. As an
example, data_blah matches the rule for non_quoted_1_string (and type 2).
In fact any STAR keyword will, which is why keywords are excluded in the
English prose. The way to handle this in practice is to order your
productions so that <data_heading> (and the keyword productions) fire
first. That is just a word of warning that these things need to be
implemented thoughtfully.
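
A minimal sketch of that ordering in Python, to make the point concrete.
The patterns are simplified stand-ins and not the official STAR
productions, and the names RULES, NAIVE_RULES and tokenize are invented
for this illustration only:

```python
import re

# Ordered (token name, pattern) pairs: the first rule that matches wins.
# These patterns are simplified assumptions, not the real STAR grammar.
RULES = [
    ("DATA_HEADING", re.compile(r"data_\S+")),   # keyword rule, tried first
    ("WSPACE", re.compile(r"\s+")),
    ("DATA_NAME", re.compile(r"_\S+")),
    ("NON_QUOTED_STRING", re.compile(r"\S+")),   # catch-all, tried last
]

# Same rules with the catch-all moved to the front, to show the problem.
NAIVE_RULES = [RULES[3]] + RULES[:3]


def tokenize(text, rules=RULES):
    """Return (name, lexeme) pairs, trying the rules in the given order."""
    tokens = []
    pos = 0
    while pos < len(text):
        for name, pattern in rules:
            match = pattern.match(text, pos)
            if match:
                if name != "WSPACE":          # whitespace is dropped
                    tokens.append((name, match.group()))
                pos = match.end()
                break
        else:
            raise ValueError(f"no rule matches at position {pos}")
    return tokens


line = "data_block#53 _a_data_item"
print(tokenize(line))               # heading rule fires first
print(tokenize(line, NAIVE_RULES))  # catch-all swallows the keyword
```

With the heading rule first, data_block#53 comes out as one data heading
(the #53 is part of it, since # is a non-blank character). With the
catch-all rule first, the same text is reported as an ordinary non-quoted
string and the keyword is never recognised.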

One quick solution would be to change the production to:

<comment>               ::= {<blank> | <terminate>} '#' <char>* 

This needs to be checked in light of the other productions. I have had a
quick look and think it is all right, but others should check the
implications of the change thoroughly. One possible problem is that this
change causes another production to define a comment token as requiring
two blanks or two terminates.
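
As a rough check of what the changed production would accept, here is a
small Python sketch. It assumes <blank> means a space or tab and
<terminate> means a line ending, and it ignores quoted strings and text
fields entirely. COMMENT and find_comments are names made up for this
illustration:

```python
import re

# A comment must be preceded by a blank or a line terminator, and it runs
# to the end of the line. The separator is consumed but not returned.
# Assumption: <blank> is space/tab, <terminate> is CR or LF.
COMMENT = re.compile(r"[ \t\r\n](#[^\r\n]*)")


def find_comments(text):
    """Return the comments in text under the proposed production."""
    return COMMENT.findall(text)


# '#' glued to the heading: no separator before it, so no comment.
print(find_comments("data_block#53 _a_data_item"))

# '#' after a blank: the rest of the line is a comment.
print(find_comments("data_block #53 _a_data_item"))
```

Note that, read literally, the production never matches a comment at the
very start of the input, because there is no preceding blank or terminate
there. The sketch reproduces that behaviour on purpose.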

Any thoughts?

cheers

Nick

--------------------------------
Dr Nick Spadaccini
Department of Computer Science              voice: +(61 8) 9380 3452
University of Western Australia               fax: +(61 8) 9380 1089
Nedlands, Perth,  WA  6907                 email: nick@cs.uwa.edu.au
AUSTRALIA                        web: http://www.cs.uwa.edu.au/~nick



International Union of Crystallography