Discussion List Archives

CIF line limits

Dear Colleagues,

	As has been hinted in earlier messages, there has been a
private discussion underway about the need to remove the 80 character line
limit that currently exists in CIF.  I want to open that discussion to
the wider group that subscribes to the Comcifs discussion list, the list
on which this message is being distributed.

	The question of removing the 80 character line limit was raised
with the Dictionary Review Committee (Brown, McMahon and Westbrook) by the
Protein Data Bank (PDB) who find that they need longer lines for their CIF
release of the PDB.  The Dictionary Review Committee consulted with
Bernstein, Hall and Spadaccini and rapidly agreed that there was no reason
in principle why the line limit should not be increased (or removed
altogether), provided allowance was made for the fact that some existing
software might need to be converted to handle longer lines.  At this point
it seemed that the matter could be brought to the Comcifs discussion list
for further airing and quickly approved.

	However, life is never simple.  Bernstein pointed out that some
platforms cannot handle unlimited lines and that some large formal limit
was still needed (he proposed 200 characters).  McMahon was concerned to
introduce a formal way in which the longer lines could be broken down into
shorter lines to allow existing software to handle the new types of file.  
Hall and McMahon separately considered that changing the line limit
provided a useful opportunity to make other changes, some of which will be
required by the new developments in DDL expected next year.  All agreed
that some form of versioning of CIF was needed with version 1.0 being the
original version of CIF and 1.1 being the version in which the 32
character limit on datanames was lifted.  Datanames are still effectively
limited to 80 characters since there is no mechanism for them to be split
between lines.  All agreed also that there were good stylistic reasons for
CIFs to observe the 80 character limit where possible, even if longer
lines were approved, since wrap-around can be a problem for printers and
screens.  There seemed also to be a consensus that CIF version 2.0 would
be the version in which the 80 character line limit is removed or
extended.  Beyond that there were many suggestions but little
agreement.

	The release of the CIF-PDB has become a matter of some urgency
and, as chair of Comcifs, I have approved the release of a version with
lines longer than 80 characters.  It was clear that the discussion in the
extended Dictionary Review Committee involved a majority of the voting
members of Comcifs and that none of them had opposed the principle of
extending the line limit; the only concern was with the details of its
implementation.  However, it is important that we bring the
official description of CIF into line with the practice as soon as
possible.  This requires a discussion on the full Comcifs discussion list
before the voting members are asked to approve.  I am therefore
transferring the discussion from the extended Dictionary Review Committee
to the Comcifs discussion list where all the proposals can be openly
presented and argued.  Please add your comments to this thread (preferably
by using the 'reply' feature of your email and replying to the discussion
list).

	Briefly summarizing the discussion so far, in addition to adding
version numbers and removing or extending the 80 character line limit, it
is proposed to introduce a CIF version 1.2 that would retain the 80
character limit but would include a mechanism for recognising files with
longer lines that had been split to be compatible with the 80 character
line length.  In addition McMahon has proposed a number of other minor
changes, mostly growing out of experience with Acta Cryst. C.  A version
2.0 is proposed that would see the line limit either increased or removed.  
In addition Hall has proposed to include a number of features such as
save_frames that are part of STAR but not currently allowed in CIF.  
Since these additions may need to be more carefully considered than the
increase in the line limit, it has been suggested that we could work most
expeditiously by adding these features incrementally in versions 2.0 and
2.1.  This would allow the line limit change to be introduced quickly while
we give more careful consideration to the other changes.  However, this
approach has been strongly opposed on the grounds that programmers need to
work with a stable definition.

	I am therefore bringing forward two proposals for discussion:

1. That the different CIF versions be numbered with 1.0 being the original
CIF definition, and 1.1 the version that extends dataname lengths from 32
to 80 characters.  These two versions currently exist but do not have
version numbers.

2. That version 2.0 increase the line limit from 80 characters to 200
characters.
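
To illustrate what proposal 2 might mean for existing software, here is a
minimal Python sketch that reports which lines of a file exceed a given
limit.  The function name, the constant names and the rule that the line
terminator is not counted are all assumptions made for this example; none
of them comes from a CIF specification.

```python
# Hypothetical helper: the names and the counting rule are assumptions
# made for illustration, not part of any CIF definition.
LINE_LIMIT_CURRENT = 80     # limit in the existing CIF definition
LINE_LIMIT_PROPOSED = 200   # limit proposed for version 2.0


def lines_exceeding(text, limit):
    """Return (line_number, length) for each line longer than `limit`.

    Line numbers start at 1 and the line terminator is not counted.
    """
    return [
        (number, len(line))
        for number, line in enumerate(text.splitlines(), start=1)
        if len(line) > limit
    ]


# A small sample whose third line is a 120-character comment.
sample = "data_example\n_cell_length_a 10.0\n#" + "x" * 119 + "\n"

print(lines_exceeding(sample, LINE_LIMIT_CURRENT))   # [(3, 120)]
print(lines_exceeding(sample, LINE_LIMIT_PROPOSED))  # []
```

A file that passes the check at 80 characters would remain readable by
existing software, while one that passes only at 200 would need the newer
limit.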

Others will wish to make additions, or to propose changes, to these
proposals and I will leave each of them to make their own case.  I will
monitor the discussion and do my best to bring it to a timely conclusion.

	The discussion is now open.

			David Brown
			(Chair of Comcifs)


*****************************************************
Dr. I. David Brown,  Professor Emeritus
Brockhouse Institute for Materials Research, 
McMaster University, Hamilton, Ontario, Canada
Tel: 1-(905)-525-9140 ext 24710
Fax: 1-(905)-521-2773
idbrown@mcmaster.ca
*****************************************************