Discussion List Archives

Re: parser validation tools

  • Subject: Re: parser validation tools
  • From: Brian McMahon <bm@xxxxxxxx>
  • Date: Tue, 16 May 2000 13:35:22 +0100 (BST)

> > Of course, much also depends on what you mean when you say "parser."

Indeed. Here is an idea from the SGML world - is it worth looking at a CIF
equivalent? One of the standard tools in the SGML developers' community is
nsgmls, a parser/validator by James Clark. It performs a couple of
functions: 

   (1) It tokenises an SGML stream and outputs an isomorphic stream in the
   so-called ESIS (Element Structure Information Set) form. The ESIS
   stream is designed to be rather easier for other applications to
   parse. Among other things (because SGML is a rather
   complex metalanguage) it supplies closing tags for constructs where an
   optional directive in the DTD allows them to be implicit. You could imagine
   that in the rather simpler CIF world, it might be helpful to output tokens
   in the isomorphic stream declaring end-of-loop or end-of-data-block.

   (2) It validates against the document type definition (DTD) associated
   with that input file. In CIF, one would map that to validating against
   the dictionary or dictionaries associated with the CIF. nsgmls tries to
   be clever about parse and validation errors, reporting them and then
   recovering as far as possible so as to report other errors downstream.
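
To make the idea in (1) concrete, here is a minimal sketch in Python of a
tokeniser that emits an event stream with explicit end-of-loop and
end-of-data-block markers. It is not nsgmls and not an agreed CIF tool: the
event names (DATA, LOOP, TAG, VALUE, END_LOOP, END_DATA) and the function
cif_events are invented for illustration, and the input handling is
deliberately simplified (no semicolon-delimited text fields, no save
frames, comments only on lines of their own).

```python
import re

# Quoted strings first, then any run of non-whitespace characters.
TOKEN = re.compile(r"'[^']*'|\"[^\"]*\"|\S+")


def unquote(tok):
    """Drop one pair of matching surrounding quotes, if present."""
    if len(tok) >= 2 and tok[0] == tok[-1] and tok[0] in "'\"":
        return tok[1:-1]
    return tok


def cif_events(text):
    """Yield (event, payload) pairs for a simplified CIF-like input.

    Hypothetical event vocabulary; the end markers are implicit in the
    input and are made explicit here.
    """
    in_block = False
    in_loop = False
    header_open = False  # True while still reading a loop's tag list
    for line in text.splitlines():
        if line.lstrip().startswith("#"):
            continue  # simplified: only whole-line comments are skipped
        for tok in TOKEN.findall(line):
            low = tok.lower()
            if low.startswith("data_"):
                # A new data block closes any open loop and block.
                if in_loop:
                    yield ("END_LOOP", "")
                    in_loop = False
                if in_block:
                    yield ("END_DATA", "")
                in_block = True
                yield ("DATA", tok[5:])
            elif low == "loop_":
                if in_loop:
                    yield ("END_LOOP", "")
                in_loop = True
                header_open = True
                yield ("LOOP", "")
            elif tok.startswith("_"):
                # A tag that follows loop values ends that loop.
                if in_loop and not header_open:
                    yield ("END_LOOP", "")
                    in_loop = False
                yield ("TAG", tok)
            else:
                if in_loop:
                    header_open = False
                yield ("VALUE", unquote(tok))
    if in_loop:
        yield ("END_LOOP", "")
    if in_block:
        yield ("END_DATA", "")


SAMPLE = """\
data_test
_cell_length_a 5.43
loop_
_atom_site_label
_atom_site_type_symbol
Si1 Si
O1 O
_symmetry_space_group_name_H-M 'P 1'
"""

if __name__ == "__main__":
    for event, payload in cif_events(SAMPLE):
        print(event, payload)
```

A downstream application reading this stream never has to work out for
itself where a loop or data block finishes, which is the kind of help the
ESIS stream gives SGML applications.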

Of course, behind the application lies a library (sp for the SGML example)
that is responsible for such things as rigorous tokenisation and parsing.
There is also a functional specification for the isomorphic (ESIS) stream.
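
The validation step in (2) could sit on top of such a library along the
following lines. This is only a sketch under stated assumptions: the
dictionary is a toy Python mapping from tag name to a type code ("numb" or
"char"), not a real CIF dictionary, and the function validate is a
hypothetical name. The point it illustrates is the recovery behaviour:
every error is recorded and checking continues, rather than stopping at
the first problem.

```python
import re

# A number, optionally followed by a standard uncertainty in parentheses.
NUMB = re.compile(r"[+-]?(\d+\.?\d*|\.\d+)([eE][+-]?\d+)?(\(\d+\))?")

# Toy stand-in for a dictionary: tag name -> expected type (assumption).
TOY_DICT = {
    "_cell_length_a": "numb",
    "_cell_length_b": "numb",
    "_cell_length_c": "numb",
    "_chemical_name_common": "char",
}


def validate(items, dictionary):
    """Check (line, tag, value) triples, collecting every error found."""
    errors = []
    for lineno, tag, value in items:
        rule = dictionary.get(tag.lower())
        if rule is None:
            errors.append((lineno, tag, "tag not defined in dictionary"))
            continue  # recover: skip this item and keep checking
        if value in ("?", "."):
            continue  # unknown / inapplicable placeholders are accepted
        if rule == "numb" and NUMB.fullmatch(value) is None:
            errors.append((lineno, tag, "value is not numeric: " + value))
    return errors


ITEMS = [
    (1, "_cell_length_a", "5.43(2)"),
    (2, "_cell_length_b", "abc"),            # wrong type
    (3, "_cell_lenght_c", "5.43"),           # misspelt tag
    (4, "_chemical_name_common", "quartz"),
]

if __name__ == "__main__":
    for lineno, tag, message in validate(ITEMS, TOY_DICT):
        print("line %d: %s: %s" % (lineno, tag, message))
```

Run on the sample items, both the type error on line 2 and the misspelt
tag on line 3 are reported in a single pass.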

There is at least one precedent in the CIF world for an isomorphic data
representation: Dave Stampf at the PDB a few years back created an
isomorphic format called zinc to allow easy use of Unix line-oriented
utilities on CIF data sets. As far as I recall it worked quite well at
the syntactic level for reasonably straightforward CIFs, though there were
some bugs that occasionally surfaced.
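
As an illustration of why a line-oriented form is convenient, the sketch
below flattens already-parsed data into one tab-separated record per
value, so that each line can be filtered on its own. The record layout
(block, tag, row index, value) and the function flatten are assumptions
made for this example; they are not a description of the zinc format
itself.

```python
def flatten(block_name, items, loops):
    """Return one tab-separated line per value: block, tag, row, value.

    items: list of (tag, value) pairs outside any loop (row shown as "-")
    loops: list of (tags, rows) where each row is a list of values
    """
    lines = []
    for tag, value in items:
        lines.append("\t".join([block_name, tag, "-", value]))
    for tags, rows in loops:
        for index, row in enumerate(rows):
            for tag, value in zip(tags, row):
                lines.append("\t".join([block_name, tag, str(index), value]))
    return lines


LINES = flatten(
    "test",
    [("_cell_length_a", "5.43")],
    [(["_atom_site_label", "_atom_site_type_symbol"],
      [["Si1", "Si"], ["O1", "O"]])],
)

# A grep-like filter: every record carries its own block and tag, so one
# line is enough to decide whether it is wanted.
LABELS = [line.split("\t")[3] for line in LINES
          if "\t_atom_site_label\t" in line]

if __name__ == "__main__":
    print("\n".join(LINES))
    print(LABELS)
```

Because each line is self-describing, the same selection could be made
with grep, cut or awk on the printed output.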

Would people on this list generally find such a tool useful, or is it not
worth the effort of development?

> Brian, do you know how close Nick is to having the revised BNF ready?

Nick has sent me a URL today, which I'll post as a separate thread.

Regards
Brian

International Union of Crystallography