Discussion List Archives

RE: Fine-tuning CIF dictionary regexes

  • Subject: RE: Fine-tuning CIF dictionary regexes
  • From: "Bollinger, John Clayton" <jobollin@xxxxxxxxxxx>
  • Date: Mon, 18 Apr 2005 10:09:24 -0500

Regarding these two specific REs from mm_cif:

> floating point numbers:
> '-?(([0-9]+)[.]?|([0-9]*[.][0-9]+))([(][0-9]+[)])?([eE][+-]?[0-9]+)?'

This RE does not appear to agree with the CIF 1.1 formal grammar, which
puts the standard uncertainty after the exponent rather than before it.
(See the productions for <Numeric>, <Number>, and <Float>.)  Which is
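
To illustrate the ordering difference only, here is a minimal Python
sketch.  MMCIF_FLOAT is the quoted RE unchanged; REORDERED_FLOAT is a
hypothetical variant that merely swaps the last two groups so that the
exponent comes before the parenthesized uncertainty.  It is not a
transcription of the CIF 1.1 grammar, and both names are mine.

```python
import re

# The RE quoted from mm_cif: uncertainty "(n)" before the exponent.
MMCIF_FLOAT = re.compile(
    r"-?(([0-9]+)[.]?|([0-9]*[.][0-9]+))([(][0-9]+[)])?([eE][+-]?[0-9]+)?"
)

# Hypothetical variant: same pieces, exponent before the uncertainty.
REORDERED_FLOAT = re.compile(
    r"-?(([0-9]+)[.]?|([0-9]*[.][0-9]+))([eE][+-]?[0-9]+)?([(][0-9]+[)])?"
)

def accepts(pattern, token):
    """Return True if the whole token matches the pattern."""
    return pattern.fullmatch(token) is not None
```

With these definitions, "1.23(5)e4" is accepted only by MMCIF_FLOAT and
"1.23e4(5)" only by REORDERED_FLOAT, while a value with no exponent,
such as "1.23(5)", is accepted by both.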

> symmetry operations
> '([1-9]|[1-9][0-9]|1[0-8][0-9]|19[0-2])(_[1-9][1-9][1-9])?'

I think it's overkill for the pattern to restrict the possible symop
number so specifically.  Which numbers are actually valid in any
particular case (and to which specific operation they correspond) depends
on other data in the CIF.  Since validation is needed after the match
anyway, making the RE a bit looser would allow a processor to report
errors more specifically.  I might write the symop RE like
this: '[1-9][0-9]*(_[1-9]{3,3})?'.  (That also happens to remove the
alternation problem, though that was not my objective.)  That way, if I
accidentally write 244_555 instead of 24_555, a processor can tell me
"bad symop number" instead of "unrecognized token".
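
A rough Python sketch of that two-stage check follows.  The function
name check_symop and its n_symops argument (the number of symmetry
operations defined elsewhere in the CIF) are hypothetical, chosen only
for illustration; a real processor would take that count from the
relevant data items.

```python
import re

# The strict RE quoted from mm_cif, for comparison.
STRICT_SYMOP = re.compile(
    r"([1-9]|[1-9][0-9]|1[0-8][0-9]|19[0-2])(_[1-9][1-9][1-9])?"
)

# The looser RE suggested above, with capture groups added so the
# operation number and the translation digits can be read back.
LOOSE_SYMOP = re.compile(r"([1-9][0-9]*)(?:_([1-9]{3}))?")

def check_symop(token, n_symops):
    """Classify a symop token using the loose RE plus a range check.

    n_symops is an assumed input: the count of symmetry operations
    available for this particular structure.
    """
    m = LOOSE_SYMOP.fullmatch(token)
    if m is None:
        return "unrecognized token"
    if int(m.group(1)) > n_symops:
        return "bad symop number"
    return "ok"
```

Under the strict RE, "244_555" simply fails to match, so all a
processor can say is that the token is unrecognized.  With the loose RE
it matches, and check_symop("244_555", 192) can report "bad symop
number" instead.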


John Bollinger


John C. Bollinger, Ph.D.
Indiana University
Molecular Structure Center

cif-developers mailing list
