Discussion List Archives

"imageNCIF" workshop

  • Subject: "imageNCIF" workshop
  • From: Andy Hammersley <hammersl@esrf.fr>
  • Date: Thu, 28 Aug 97 13:59:16 +0200

Hello Everybody,

   I've just joined the group so I'll start by briefly introducing
myself. I'm Andy Hammersley (hammersley@esrf.fr) and I work at the 
European Synchrotron Radiation Facility in Grenoble, as a scientific 
programmer. I'm the author of the FIT2D data analysis program, which 
some of you may know. My work covers many areas of X-ray science, 
including support for SAS experiments.

Frustration with the multitude of image data formats which keep
being invented (sometimes poorly) has led me to get involved
with a project to define an international data format for image
storage within the "crystallographic" community. ("Crystallographic"
should be interpreted in a very wide sense; e.g. the IUCr includes
the SAS commission.) This is being done in accord with the IUCr
COMCIFS committee with the aim of IUCr recognition, and hopefully
widespread acceptance.

John Barnes suggested that I post information on this initiative to
the SAS community. At the end of this message is an article (very
slightly adapted) which I originally wrote for the CCP-13 newsletter.
There is also the reference for the paper which describes the CIF
format and the original "data dictionary".

Much e-mail discussion has already taken place concerning the format,
but to bring ideas clearly together, finalise a first working draft,
and write interface software, a workshop is being organised at Brookhaven
National Laboratory on the 20th-22nd October. The first day is intended
to be a general open discussion day, whilst the others are intended to
be "working" sessions where detailed specifications and code will be
written. It would clearly be appropriate if the SAS community were
represented, at least for the first day's discussions.

Best Regards,

          Andy Hammersley


         "imageNCIF": An Initiative to Standardise Image Formats

                           A P Hammersley

           ESRF, BP 220, 38043 Grenoble Cedex, France.

                      E-mail: hammersley@esrf.fr


Data formats, and in particular image formats, are a problem! Most scientists
will know the problem of having to find or write a conversion program to
convert a file in a particular format to a format which can be input to
the analysis program which they want to use. Similarly, application programmers
will know the problem that however many formats they support, there
will always be new detectors with new data formats which will be

If a sufficiently versatile common format were widely adopted, this
"Babylon" of formats would at least be limited and maybe the number
would eventually decrease.
Initiatives to define such formats are not new, but for storage of
large quantities of experimental data there has been no commonly adopted
format to date.
For the task of passing more limited quantities of processed, typically
structural, data amongst the crystallographic community, a common
standard format has been defined: "CIF" [1].
"CIF" stands for "Crystallographic Information File", and is an
ASCII text-based, flexible, extensible and human-readable archive file

In March 1995 a computing workshop was organised at the Brookhaven National
Laboratory concentrating on the use and design of "Graphical User
Interfaces" (GUIs). One subject area which was clearly of general
common interest was image formats. 
A working group was formed and time was dedicated to
an open discussion on the requirements of such a format. It was here (to
my knowledge) that the idea of extending the CIF-concept to define the
header information for images was first discussed by a sizeable group.

What is CIF?

CIF is a standard format maintained and "owned" by the International
Union of Crystallography (IUCr) for archiving and transporting 
crystallographic data [1]. One important current use is the 
submission of papers on structure determinations to Acta
Crystallographica Section C.

The format consists of simple ASCII text pairs of keyword and value.
A large "dictionary" defines the keywords and possible values.
There is support for comments, for multi-line character text,
and for structuring separate data sections. The keywords are defined
within a hierarchical system of "classes" (data name categories) and
"sub-classes"; e.g. all keywords which start with _diffrn refer to data
from diffraction or other experimental measurements. A typical line from
a CIF file, which could also be relevant to an experimental image file, is:

_diffrn_radiation_wavelength  0.76       # This is 16.3 keV

Importantly, the dictionary defines the precise meaning of the
data names, and the units and valid range when appropriate.
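
As a rough illustration of how a program might read such a keyword / value
line, here is a minimal Python sketch. It is not one of the IUCr tools; the
function names are hypothetical, the comment handling is deliberately naive,
and the wavelength is assumed to be in Angstroms (which is consistent with
the 16.3 keV comment in the example line).

```python
HC_KEV_ANGSTROM = 12.398419843  # h*c expressed in keV * Angstrom


def parse_cif_line(line):
    """Return (data_name, value) for a simple one-line CIF item, else None.

    Hypothetical helper for illustration only: it handles a single
    "_name value" pair on one line and nothing else.
    """
    # Naive comment handling: everything after '#' is dropped. Real CIF
    # allows '#' inside quoted strings, which this sketch ignores.
    body = line.split("#", 1)[0].strip()
    if not body.startswith("_"):
        return None
    parts = body.split(None, 1)
    if len(parts) != 2:
        return None
    return parts[0], parts[1].strip()


def wavelength_to_kev(wavelength_angstrom):
    """Convert a wavelength (assumed to be in Angstroms) to energy in keV."""
    return HC_KEV_ANGSTROM / wavelength_angstrom


line = "_diffrn_radiation_wavelength  0.76       # This is 16.3 keV"
name, value = parse_cif_line(line)
energy = wavelength_to_kev(float(value))
print(name, value, round(energy, 1))
```

A real reader would take the units and valid range from the dictionary
definition of the data name rather than assuming them as done here.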

A number of software tools are available from the IUCr to work with
CIFs, and libraries are available to help read and write CIFs.
CIF is maintained and extended through the IUCr COMCIFS committee.
At present there are initiatives to extend CIF (Core
dictionary) to cover the extra needs of macromolecular crystallography
(mmCIF) and powder diffraction. 

For more information on CIF, there is a World-Wide-Web page with links
to associated pages (http://www.iucr.ac.uk/cif/home.html).

What is "imageNCIF"?

"imageNCIF" (= image (Not) CIF; cf. mmCIF) is an E-mail based
discussion group developing the idea of extending the CIF concept to
cover the storage of experimental data, and in particular 2-D "image"
data. There are presently about 15 members of this group, which contains
representatives from commercial detector manufacturers, programmers of
data analysis software, members concerned with data acquisition
at user facilities, and crystallographers from a variety of different
scientific disciplines.

The aim is to standardise the passing of image and associated crystallographic
experimental data from one institute to another,
from one make of computer system to another, and from one computer program
(acquisition or analysis) to another. It is desirable that the image file
contains the necessary associated experimental information to make data
processing as automatic as possible.

The basic aim is essentially the same as that of CIF, but the difference
in the quantities of data involved means that there is
agreement amongst the members of the group
that the ASCII encoding of image data is not appropriate.
Given that it is highly desirable that header information and the image
data are kept together in the same file, this means that the format is
binary in nature, and cannot be considered as compliant CIF. Hence the
"Not" in "imageNCIF".

Nevertheless, the advantage of using the existing CIF structure for
naming and defining data items is sufficiently important
to define an associated format which is closely related to CIF:
"CIF-compatible". The header section of such a file would contain CIF 
keyword / value pairs. This would differ from CIF in the manner
in which the "lines" were separated. A simple utility program could 
extract the header section and write it out in a true CIF form.
Similarly analysis programs might use some of the header information for
their own processing, ignore other items, and write an output CIF which 
contains their results plus most of the original CIF-style items.
A number of new CIF data names need to be defined.
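
To make the idea of the "simple utility program" more concrete, here is a
minimal Python sketch. The file layout it assumes is purely hypothetical,
since the real one is still under discussion: ASCII header lines separated
by CR LF, a made-up end-of-header marker, and then the raw binary image
data. The names LINE_SEP, HEADER_END and split_image_file are illustrative
assumptions and not part of any agreed specification.

```python
LINE_SEP = b"\r\n"                           # hypothetical header line separator
HEADER_END = b"#END_OF_HEADER" + LINE_SEP    # hypothetical end-of-header marker


def split_image_file(data):
    """Return (header_as_cif_text, binary_image_bytes) for the assumed layout."""
    end = data.find(HEADER_END)
    if end < 0:
        raise ValueError("end-of-header marker not found")
    header = data[:end]
    image = data[end + len(HEADER_END):]
    # Rewrite the header with plain newline-separated lines, dropping the
    # empty piece left behind by the final separator.
    lines = [ln.decode("ascii") for ln in header.split(LINE_SEP) if ln]
    return "\n".join(lines) + "\n", image


# A tiny in-memory example: one header item followed by four "pixel" bytes.
example = (
    b"_diffrn_radiation_wavelength  0.76" + LINE_SEP
    + HEADER_END
    + bytes([0, 1, 2, 255])
)
cif_text, image = split_image_file(example)
print(cif_text, end="")
print(len(image), "bytes of image data")
```

Searching for the first occurrence of the marker means that binary data
which happens to contain the same byte sequence does no harm, provided the
marker cannot occur inside the header itself.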

The details of this approach are presently being discussed by
the working group.

Joining "imageNCIF"

So far there are no "rules" and "imageNCIF" seems to work reasonably
well on an open membership basis. I suggest that one representative per
institute, or one per scientific group within an institute, is
a sensible manner in which to channel views whilst keeping the
working group from getting too big. Input from the SAS community would
be valuable, either through individuals or through a representative 

[1] S R Hall, F H Allen and I D Brown, Acta Cryst. A47, 655-685 (1991).