Discussion List Archives

Re: CIF line limits

On Thu, 16 Nov 2000, I. David Brown wrote:

> 	However, life is never simple.  Bernstein pointed out that some
> platforms cannot handle unlimited lines and that some large formal limit
> was still needed (he proposed 200 characters).  McMahon was concerned to

Actually David, this discussion dates back to 1992, and the members of
this discussion list need to be reminded that STAR (upon which CIF is
based) has never had any line length restriction nor any dataname length
restriction. The software built for STAR has always fulfilled the
requirements of this specification and it is hard to see why in Y2000
CIF *needs* any restrictions. So my vote (and since I am not a
voting member that doesn't say much) is that any restrictions be removed.

Herb's concerns are directed purely at Fortran programmers, and in
particular those that do not use the C library extensions. Fortran's IO
has always been arcane since it needs to think in terms of records rather
than a stream of bytes. However, to restrict a universal file specification
because of the limitations of one programming language seems rather
short-sighted. Quite frankly, in terms of sound practice, extending the limit
from 80 to 200 is no different to extending it from 80 to 81! Eventually
someone's going to want a record that is 201 bytes. As a matter of
interest, how many other file formats that people on this discussion list
deal with have line length restrictions?

> introduce a formal way in which the longer lines could be broken down into
> shorter lines to allow existing software to handle the new types of file.  

The reasoning is a little fuzzy. There exists software which will not be
changed to meet the new specification. There exist files with lines longer
than 80 bytes. So how is that software going to read these files? Since
a file isn't going to miraculously break itself down into shorter lines,
someone else will have to write a program that is able to read the file
properly, and hack it into a form so that existing software doesn't have
to be updated. And the sum total of this is .....

>                                          Datanames are still effectively
> limited to 80 characters since there is no mechanism for them to be split
> between lines.  

You guessed it, another artificial limit needs to be introduced (or
extended)! Rather than writing software that does the job properly just so
it can write files for software that doesn't do its job properly, let the
IUCr actively insist that all software that claims to be CIF compatible
adopt the new specification. That way datanames need NOT be restricted to
any length.

>            All agreed also that there were good stylistic reasons for
> CIFs to observe the 80 character limit where possible, even if longer
> lines were approved, since wrap-around can be a problem for printers and
> screens.  

But Brian's suggestion of "eliding" line terminations does have stylistic
merits. However, its purpose is purely stylistic and should not be an
encouragement for people to continue to write or support software that
does not meet the new CIF specification. Moreover, the semantics of the
elided line terminations need to be pinned down. For instance, csh, tcsh
and Tcl replace the elide digraph with a space, whereas zsh and bash
replace it with a null, i.e. the two lines are joined with nothing between
them (which is Herb's favoured option if I recall).
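
To make the difference between the two readings concrete, here is a
minimal Python sketch. It assumes the elide digraph is a backslash
immediately followed by a newline, as in the shells mentioned above; the
proposal itself may settle on something else, and the function name
unfold and the sample dataname are made up purely for illustration.

```python
def unfold(text: str, joiner: str = "") -> str:
    """Join physical lines ending in the (assumed) elide digraph.

    The digraph is taken to be a backslash immediately followed by a
    newline. `joiner` is what replaces it: "" gives the sh/bash reading,
    " " gives the csh/Tcl reading.
    """
    return text.replace("\\\n", joiner)


# A hypothetical dataname folded across two physical lines.
folded = "_a_long_\\\ndata_name 1.0\n"

print(repr(unfold(folded, " ")))  # csh/Tcl reading: the name is split in two
print(repr(unfold(folded, "")))   # sh/bash reading: the name stays whole
```

With the space reading a folded dataname comes back as two tokens, so only
the null reading would let a dataname (or any other token) be continued
across lines.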

> 	I am therefore bringing forward two proposals for discussion:
> 
> 1. That the different CIF versions be numbered with 1.0 being the original
> CIF definition, and 1.1 the version that extends dataname lengths from 32
> to 80 characters.  These two versions currently exist but do not have
> version numbers.
> 
> 2. That version 2.0 increase the line limit from 80 characters to 200
> characters.

I can see no strong reason why there should be any limit imposed on line
lengths or dataname lengths. There isn't in STAR, and there isn't in many
other *modern* file standards I deal with.

cheers

Nick

--------------------------------
Dr Nick Spadaccini
Department of Computer Science              voice: +(61 8) 9380 3452
University of Western Australia               fax: +(61 8) 9380 1089
Nedlands, Perth,  WA  6907                 email: nick@cs.uwa.edu.au
AUSTRALIA                        web: http://www.cs.uwa.edu.au/~nick



