RE: The CIF BNF

  • Subject: RE: The CIF BNF
  • From: Nick Spadaccini <nick@xxxxxxxxxxxxx>
  • Date: Mon, 25 Sep 2000 03:06:57 +0100 (BST)

On Thu, 21 Sep 2000, Bollinger, John Clayton wrote:

> Well, right now the 80-character limit _is_ an intrinsic part of the
> definition of a CIF.  If one just modifies the BNF then the result
> no longer describes a CIF -- one must first persuade COMCIFS to alter
> course on the CIF specification.  My impression is that COMCIFS has
> historically resisted suggestions to modify the limit.

John, you take me back to circa 1991-92 when I first questioned this 80
character limit with the proponents of the CIF format. Over the years I
have revisited this issue, with the usual response being that "the
crystallographers" are happier with this limit. Now that the
crystallographers are questioning this limit perhaps you will achieve its
removal.

There were many reasons why CIF is CIF and not STAR. The ability to
convince people of the usefulness of this file format was predicated on
selling its simplicity. For that reason some STAR features were dropped
from the CIF specification: global blocks, save frames and nested loops.
The semantics of save frame blocks were ill-defined and considered
application specific, and nested loops are actually neat when laid out in
a file but a little messy when it comes to writing programs for access
and manipulation. Certainly dropping these features made CIF a lot
simpler and one can understand the position taken by the CIF proponents.
However, there were two arbitrary character limits introduced: the 32
character data name limit (the "who in their right mind would want a data
name longer than 32 characters" argument), and the 80 character record
limit (the "people need static arrays" argument, and the "emailers have
an 80 character record limit anyway" argument).

Of course the 32 character data name limit has since vanished, the
"first people not in their right mind" being the mmCIFers [sorry John,
Helen et al :)].

> As for the 80-character limit itself, I should like to see it lifted
> entirely.  I very well appreciate that that would cause problems for
> Fortran programmers trying to deal with CIFs, but that argument is like
> the tail trying to wag the dog.  I also recognize that the 80-character

I recall the early arguments about fixed array lengths were dominated by
Fortran programmers, but to be fair we should be clear that they were
(and probably still are) thinking in terms of Fortran 77 (possibly
Fortran IV!). However, BEFORE these very arguments were taking place the
Fortran 90 spec already supported dynamic memory allocation. Hence any
restriction wasn't really a Fortran issue.

Robin, in his follow-up to this thread, mentions that a "more common"
restriction would be 255 characters. Why 80 was chosen may be lost in the
sands of time, but punch card record lengths come to mind, as does the
notion that "emailers had the same restriction" - though I have talked to
several people about this notion and all have little recollection of mail
handlers in the early 90s having such a restriction.

However, Robin does close his mail with the statement that a restriction
does give considerable practical advantage. I really can't see why that
would be the case, since one can find a file that will break the
restriction. Why not completely generalise your application?

> limit can make CIFs more readable in certain display and printing
> environments, but few people are limited to such environments.  On the
> other hand, the limit is completely artificial, in that it is in no way
> driven by the content of the file; and it introduces unnecessary
> software compatibility issues, in that the handling of violations of
> the limit is not defined.  Removing the limit would make it easier to
> adapt STAR parsers for CIF and vice versa.  Why retain it in any form?

Yes, by deleting a few productions in the STAR Flex/Bison systems you
will have CIF parsers pre-built, or you can leave the parsers intact and
simply raise/throw exceptions when the CIF restrictions are violated.
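
A minimal sketch of that second option, written in Python rather than
Flex/Bison and purely illustrative. The names CifRestrictionError and
check_record_lengths are hypothetical, and only the 80 character record
limit is checked; a real reader would hook a check like this into the
existing STAR parser instead of scanning the text separately.

```python
class CifRestrictionError(ValueError):
    """Raised when a line exceeds the CIF record-length limit."""

    def __init__(self, line_number, length, limit):
        super().__init__(
            f"line {line_number}: {length} characters exceeds the "
            f"{limit}-character limit"
        )
        self.line_number = line_number
        self.length = length
        self.limit = limit


def check_record_lengths(text, limit=80):
    """Raise CifRestrictionError on the first line longer than `limit`.

    Pass limit=None to skip the check entirely, i.e. to accept the
    input as unrestricted STAR-style records.
    """
    if limit is None:
        return
    for line_number, line in enumerate(text.splitlines(), start=1):
        if len(line) > limit:
            raise CifRestrictionError(line_number, len(line), limit)
```

Keeping the limit as a parameter means that lifting or changing it is a
one-argument change in the caller, not a change to the parser.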

WRT the 80 character limit you can guess what I would vote (if I had a
vote).

cheers

Nick

--------------------------------
Dr Nick Spadaccini
Department of Computer Science              voice: +(61 8) 9380 3452
University of Western Australia               fax: +(61 8) 9380 1089
Nedlands, Perth,  WA  6907                 email: nick@cs.uwa.edu.au
AUSTRALIA                        web: http://www.cs.uwa.edu.au/~nick

International Union of Crystallography