Discussion List Archives


Re: CIF line limits

I strongly support Brian Toby's position on "backwards" compatibility.  We
are encouraging the community to build up a complex and expensive
infrastructure to support CIFs: not just whatever CIFs they individually
are working on, but also CIFs from databases and archives.  It is important
that we
provide them with a smooth migration path which allows them to focus on
doing science with the data, rather than having to change, unexpectedly
and without help, a lot of hardware and software to adapt to changes in
the data that reach them from the outside world.  This does not mean that we
can never change a specification -- but that we have to give due
consideration to inexpensive and efficient migration paths for people
using, say, Fortran on VAXes.

Nick's mention of MS Word is instructive.  There is a large community of
Mac users who still (to this day) use MS Word 5 (myself included).  It
happens that Microsoft did a very bad job on MS Word 6 for the Mac -- it
required upgrades to more modern machines to work at a reasonable speed.
Many people tried 6 and fell back, and many more just stayed with 5.  MS
Word 97/98 was much better, but even Microsoft realized that "backwards"
compatibility sometimes had to allow for keeping old software running with
new data, and provided translation modules to keep MS Word 5 users happy.
That does not mean change cannot happen.  I also have MS Word 98, and
value its features highly, but when I create a document for the widest use
in the community, I drop back to MS Word 5 to maximize portability.

I urge COMCIFS to be as realistic and supportive as Microsoft (!!!!!), and
to provide reasonable means for the existing CIF community to adapt to the
changes in specification we make, by including "backwards" compatibility
features (such as Brian McMahon's backslash line continuation, and
requiring that only the first "n" characters of tags be significant), and
by making demonstration of open source APIs for Fortran (and for C) a
prerequisite to adoption of a new specification.
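
To make the two compatibility features above concrete, here is a minimal
Python sketch, not a COMCIFS-approved algorithm.  It assumes that a
trailing backslash marks a continued line and that tags are compared on
their first "n" characters without regard to case.  The function names
(fold_line, unfold_lines, tags_match) and the default width of 80 are
hypothetical, chosen only for illustration.

```python
def fold_line(line, width=80):
    """Split one logical line into physical lines of at most `width`
    characters; every piece but the last ends in a backslash."""
    if width < 2:
        raise ValueError("width must leave room for the backslash")
    pieces = []
    while len(line) > width:
        # Reserve the last column for the continuation marker.
        pieces.append(line[:width - 1] + "\\")
        line = line[width - 1:]
    pieces.append(line)
    return pieces


def unfold_lines(lines):
    """Rejoin physical lines: a trailing backslash means the logical
    line continues on the next physical line."""
    logical = []
    pending = ""
    for line in lines:
        if line.endswith("\\"):
            pending += line[:-1]   # drop the marker, keep accumulating
        else:
            logical.append(pending + line)
            pending = ""
    if pending:                    # input ended on a continued line
        logical.append(pending)
    return logical


def tags_match(tag_a, tag_b, n):
    """Treat two tags as equal if their first n characters agree
    (case-insensitively); later characters are ignored."""
    return tag_a[:n].lower() == tag_b[:n].lower()
```

A long line folded with fold_line and passed back through unfold_lines
comes out unchanged, which is the property an old 80-column reader would
depend on.  The sketch does not handle a logical line that itself ends in
a backslash; a real specification would have to say what happens then.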

Regards,
    Herbert 

=====================================================
****                BERNSTEIN + SONS
*   *       INFORMATION SYSTEMS CONSULTANTS
****     P.O. BOX 177, BELLPORT, NY 11713-0177
*   * ***
**** *            Herbert J. Bernstein
  *   ***     yaya@bernstein-plus-sons.com
 ***     *
  *   *** 1-631-286-1339    FAX: 1-631-286-1999
=====================================================

On Fri, 17 Nov 2000, Nick Spadaccini wrote:

> On Thu, 16 Nov 2000, Brian H. Toby wrote:
> 
> Let me preface this response with the following. The PDB is trying to get
> some CIFs out that are greater than 80 char lines. If this distribution is
> being held up, awaiting the results of this discussion list then I think
> we should do the following. Though we do not agree where to set a new line
> length limit or if at all, we are pretty much in agreement that 80 chars
> is too small. I think it would be in the PDB's interest to allow them to
> distribute their files and for COMCIFS to guarantee that what they finally
> decide on will encompass whatever line length the PDB has created. That
> way they can get on with their job.  
> 
> Now in response to Brian T ....
> 
> > I know where Nick is coming from. Yes, FORTRAN is dead and gone... but
> 
> Now now Brian, I never said FORTRAN was dead and gone, nor do I wish it to
> be - even though John Backus (its author) has long apologised for creating
> the "monster that will not die"!
> 
> I think Fortran 9X goes some way toward making it more mainstream, however
> its IO still is arcane and I still think the strong adherence to the
> concept of a record is what is problematic - but that language is
> specified by the powers that be and that is the way it is.
>   
> > we are still running some of the country's best and most modern neutron
> > instruments on VAXes with software written in -- guess what language. We
> > are not alone in this. I suspect that at most a third of the world's
> > major neutron diffractometers have been upgraded to linux (or downgraded
> > to ...). The majority still run on VAXes.
> 
> Now VAXes! Those I would have guessed were dead and gone .... I'm wrong
> yet again!
> 
> > In reality, a longer CIF record length would probably have
> > little in the way of ramifications, but I mention the above to
> > demonstrate that the world of computers as seen by the informatics folks
> > may not reflect the range of hardware used by those of us in the
> > trenches. Thus, I think it very important to preserve backward
> > compatibility. If this means keeping an 80 character limit on data name
> > length, I urge those of you with votes to retain this requirement.
> 
> This is an interesting interpretation of "backward compatibility", Brian.
> Usually this term refers to the ability of updated versions of software to
> deal with legacy file structures. E.g., the Word2000 application will read a
> Word5.0 document. What is suggested by many of us will be "backwards
> compatible" by definition: our software will read any existing CIF.
> 
> However, you're suggesting that a mechanism be put into the file specification
> that makes it possible for legacy software to run with a hacked version of
> a new file. This is the same as saying the Word5.0 application should be
> able to read a Word2000 document. I guess we all have differing views of
> compatibility, backwards or forwards - but I still see what you suggest as
> an encouragement to not improve existing software to move with the times.
> This is always a danger, as people lock themselves into legacy code and
> then increasingly demand the file spec meet their needs, as opposed to the
> software meeting the needs of the current spec.
> 
> I reiterate what I said in my previous mail: I support the concept of
> elided line termination to signal to the parser that a line wrap is required. But
> its purpose is stylistic, it can be applied to a line of any length (not
> necessarily just 80 chars - though it can be used for that purpose) and
> should not impose a limit on a data name length. In short all legacy
> software will have to be updated to deal with the elided line termination,
> I suggest it be simultaneously updated to deal with the "complete" new
> file specification, then the question of backwards compatibility is
> properly implemented.
> 
> Fortran lives, it lives I tell you! (ala the scene from Frankenstein).
> 
> cheers
> 
> Nick
> 
> --------------------------------
> Dr Nick Spadaccini
> Department of Computer Science              voice: +(61 8) 9380 3452
> University of Western Australia               fax: +(61 8) 9380 1089
> Nedlands, Perth,  WA  6907                 email: nick@cs.uwa.edu.au
> AUSTRALIA                        web: http://www.cs.uwa.edu.au/~nick
> 


International Union of Crystallography