Discussion List Archives

(73) msCIF, Community Review, Dictionary Working Group, data exchange

  • To: COMCIFS@iucr.ac.uk
  • Subject: (73) msCIF, Community Review, Dictionary Working Group, data exchange
  • From: bm
  • Date: Tue, 16 Sep 1997 16:52:06 +0100

Dear Colleagues

It was a great pleasure to see so many of you at the St Louis ACA meeting
a few weeks back. The CIF workshop there provided a useful snapshot of
CIF-related activities, and was a valuable precursor to the planned mmCIF
software developers' workshop at Rutgers University in October (contact John
Westbrook, jwest@rutchem.rutgers.edu, for further information).

The present circular describes a number of topics that arose, directly or
indirectly, from discussions at St Louis. I'm sorry that I'm only now
getting round to them.


D70.1 Review of COMCIFS
-----------------------
I attach for your information a copy of the report sent to the IUCr
Executive Committee by the Chair of COMCIFS for consideration at their
meeting in Lisbon during the European Crystallographic Meeting.

D>                   REPORT OF COMCIFS FOR 1996/7
D> 
D>      Many years of Comcifs work came to fruition during the present
D> year with the approval of three major dictionaries, the first since
D> the approval of the original core dictionary by the Executive
D> Committee in 1991.  On 1996/11/12 approval was given to an extended
D> version of the core dictionary that includes additions and
D> clarifications resulting from experience with Acta Cryst. C. 
D> Considering the revolutionary character of the changes in the
D> production of Acta Cryst. C in which cif played a central role, the
D> original version of the core dictionary has performed well, but we
D> needed to correct a few infelicities in the original dictionary as
D> well as adding extra items, some required as a result of changes in
D> the techniques of structure determination.
D> 
D>      Two major new dictionaries have also been approved: the
D> macromolecular dictionary on 1997/6/8 and the powder diffraction
D> dictionary on 1997/7/9.  Both of these have been under
D> construction for several years and are the result of a great deal
D> of dedicated work by a number of people, particularly Paula
D> Fitzgerald and her team for the macromolecular dictionary and Brian
D> Toby for the powder diffraction dictionary.  It is our
D> understanding that these dictionaries, in addition to extending the
D> range of papers that can be submitted to Acta Cryst. C and D, will
D> be adopted as archival standards by the Protein Data Bank, the
D> Nucleic Acid Data Bank and the Powder Data File.
D> 
D>      There are other dictionaries in preparation dealing with
D> modulated structures, symmetry and diffuse scattering, and
D> discussions are underway with a group planning a standard for
D> reporting measurements from image plates.  For technical reasons
D> the latter group will probably not adopt the STAR format on which
D> cif is based, but they are anxious to ensure as much compatibility
D> with cif as possible. 
D> 
D>      In the future, Comcifs will spend more time approving
D> incremental changes to the three major existing dictionaries as
D> they are brought into use by journals and databases.  For this
D> reason, Comcifs is reviewing its current structure and mode of
D> operation.  We will be making recommendations to the Executive
D> Committee for changes during the coming year.
D> 
D>                     David Brown
D>                     Chair, Comcifs
D> 
D> 1997.7.15

I am not aware of any further formal submissions to David's request for a
review following those posted in circular 72.

However, as a result of some debate at the St Louis CIF workshop, some of us
(myself, John Westbrook and Herbert Bernstein) have formed a small working
group to examine some technical issues (I'll describe the purpose of this
particular group below). The terms of reference we have set ourselves for
the current project are listed below. They broadly follow David's proposed
model for Working Parties and Subcommittees, but in this instance both
functions have effectively been rolled into one. I present our self-assigned
terms of reference for the record; I think we can easily mould them to
David's proposed structure if that becomes the final one, but they might
equally replace or complement the proposal currently on the table (that is,
we might identify large-scale projects benefiting from the establishment of
a COMCIFS Subcommittee and Working Party, as against smaller task groups set
up within COMCIFS).

       Proposed informal terms of reference for COMCIFS Working Group

(1) A Working Group shall be commissioned by COMCIFS for the purpose of
    reporting on a designated problem or set of problems.

(2) The Working Group shall contain at least one voting member of COMCIFS.

(3) The formal membership of the Working Group is at the discretion of the
    chair of the Working Group. 

(4) The initial business of the Working Group shall normally be conducted by
    e-mail.  A mail alias shall be established at the Chester office, of
    the form comcifs-wg1, comcifs-wg2 etc, for group discussion.

(5) At the discretion of COMCIFS, the discussions of a Working Group may
    be published on the World Wide Web, and community input may be solicited
    on the basis of these discussions.

(6) The lifetime of a Working Group is normally restricted to the time
    needed to prepare and submit a report to COMCIFS, on a schedule
    specified at the time of commissioning.

(7) COMCIFS may accept or reject the final report of a Working Group, or may
    request revisions. The Working Group shall be dissolved upon acceptance
    or rejection of a final revision.


New topics for discussion
=========================

D73.1  Modulated structures dictionary presented for review
-----------------------------------------------------------
Gotzon Madariaga has submitted a revised version of the modulated structures
dictionary for our consideration. According to our existing procedural
rules, COMCIFS should assess its general suitability for its intended
purpose before publishing it in draft for the community to comment on. David
has proposed that this be achieved through the formation of a COMCIFS
subcommittee, and in this case he has volunteered to chair the relevant
subcommittee. We may approach other members with a request to lend their
expertise to this job also. In the meantime, and for your general
information, I have anticipated the publication of the draft by constructing
a web page (http://www.iucr.ac.uk/iucr-top/cif/ms/index.html) which is not
yet linked into the main IUCr CIF page, but is nonetheless available
for our use.

D73.2  Community review
-----------------------
Helen Berman, who is chair of the IUCr Committee on Crystallographic
Databases and an auditor of the current COMCIFS mailing list (and whom I now
formally welcome in that role), has submitted to the Executive Committee a
report on Database Activities which includes a recommendation that all
CIF dictionaries under the purview of COMCIFS should receive open community
review following the model of the mmCIF review.

In fact, the COMCIFS review process (as published in circular 12) does
include provision for open community review (see the previous item, for
example). Draft dictionaries are posted on the IUCr public ftp server
and Web pages in advance of final approval, and in practice we have
tested some of the draft dictionaries (revised Core and powder) on
prospective authors to Acta. Nevertheless, the mmCIF model, of an open list
server and clear postings to the crystallography newsgroups, has served that
community well, and I think it is a good idea to encourage other dictionary
drafting committees to follow a similar model, where they consider it
beneficial. I think it should be possible to establish list server software
at Chester for use by small dictionary committees who do not have the
experience or resources to manage such a mechanism by themselves.

D73.3  Working Group on dictionary maintenance
----------------------------------------------
As mentioned above, John Westbrook, Herbert Bernstein and I are
examining the issues involved in maintaining the growing corpus of CIF
dictionaries. Among the questions to be debated are the protocols for
identifying, locating and merging multiple dictionaries (official and
local) for data file validation, the management of local data names, the
identification of default dictionaries, and resolution of apparent conflicts
in definitions. I shall be glad to hear from any other members of COMCIFS
who may wish to take an active part in these discussions. The objective is
to assemble draft recommendations before the October workshop at Rutgers.

The discussion so far may be reviewed at the URL
http://www.iucr.ac.uk/cif/comcifs/wg1 (this does not have any public links
to it as yet).
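
To make the merging and conflict questions above concrete, here is a
minimal sketch in Python. It is purely illustrative and is not a proposal
from the Working Group: the in-memory representation (a dictionary mapping
each data name to its definition attributes), the function name
merge_dictionaries, the example data names and the "earlier dictionary
wins" policy are all assumptions made for the example. Data names are
compared without regard to case.

```python
from typing import Dict, List, Tuple

Definition = Dict[str, str]          # attribute name -> value (assumed layout)
Dictionary = Dict[str, Definition]   # data name -> definition


def merge_dictionaries(
    dictionaries: List[Tuple[str, Dictionary]],
) -> Tuple[Dictionary, List[Tuple[str, str, str]]]:
    """Merge dictionaries in priority order (highest priority first).

    Returns the merged dictionary and a list of conflicts, each given as
    (data name, label of the dictionary kept, label of the one ignored).
    The priority rule is an assumption of this sketch, not an agreed policy.
    """
    merged: Dictionary = {}
    source: Dict[str, str] = {}
    conflicts: List[Tuple[str, str, str]] = []
    for label, dictionary in dictionaries:
        for name, definition in dictionary.items():
            key = name.lower()  # compare data names case-insensitively
            if key not in merged:
                merged[key] = definition
                source[key] = label
            elif merged[key] != definition:
                # Same data name, different definition: keep the earlier
                # one and report the clash instead of overwriting silently.
                conflicts.append((key, source[key], label))
    return merged, conflicts


if __name__ == "__main__":
    core = {"_cell_length_a": {"type": "numb"}}
    local = {"_Cell_Length_A": {"type": "char"},
             "_local_flag": {"type": "char"}}
    merged, conflicts = merge_dictionaries([("core", core), ("local", local)])
    print(sorted(merged))
    print(conflicts)
```

Reporting a clash rather than silently overriding leaves the choice of
resolution rule open, which is one of the points still to be debated.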

D73.4  Relationship with non-CIF data exchange standards
--------------------------------------------------------
A number of groups are working on data exchange standards for areas that
are related to crystallography, and have chosen the CIF/STAR model as their 
starting point. COMCIFS has no direct jurisdiction over their activities,
for their constituencies are not purely crystallographic. Nevertheless,
there is a requirement for some degree of interchange with the
crystallographic community, and groups using CIF or STAR are bound by the
IUCr STAR patent and by the copyright restrictions on CIF. The groups known
to us have been punctilious in seeking the necessary permissions, and in
trying to work with the CIF community. The two I have particularly in mind are
the nmrIF group represented on our mailing list by Eldon Ulrich, and the
imageNCIF/CBF group represented by Andy Hammersley. Each group wishes to
meet the needs of its community by non-CIF extensions. In the case of the
NMR community, a full STAR formulation is required with save frames and
nested loops to store and access the ensemble structures that are generated
from NMR experiments. The image community wish to embed binary image data in
CIF-like annotatory files. Both communities reject the restrictions of the
CIF standard; but both communities are carrying out useful work that could
benefit the evolution of CIF. There are many examples of crystallographic
data structures where the use of nested loops might be effective and
efficient; equally, it could be useful to embed binary graphics or
word-processed data within a publication CIF for Acta. There is at present
some measure of unease as to the best way of reconciling the not-quite-CIF
applications with the crystallographic standard that we are charged with
maintaining.

Herbert Bernstein has suggested that a fruitful approach might be to
issue a COMCIFS endorsement of data exchange standards developed by groups
such as those indicated, provided that their full realisation included a
set of public software tools for translating to a clean CIF format. That is,
an NMR STAR file describing an ensemble of structures and using save frames
and nested loops cannot form part of an mmCIF deposition with the PDB.
However, a CIF version of the NMR data set (with nested loops "unrolled" to
a flat tabulation and save frames expanded or moved to linked data blocks)
could be appended to an mmCIF data file to provide a fully CIF compliant
interchange file. Likewise, a working image file could be converted to a
pure-ASCII interchange CIF including uuencoded or binhex'd fragments.
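
The "unrolling" of a nested loop can be sketched as follows. This is only
an illustration under stated assumptions: the nested data are held as
Python tuples rather than parsed from a real STAR file, the function names
are invented for the example, and the data names (_model_id, _atom_label,
_atom_x) are hypothetical and taken from no dictionary.

```python
from typing import List, Sequence, Tuple

NestedRow = Tuple[Sequence[object], Sequence[Sequence[object]]]


def unroll_nested_loop(
    outer_names: Sequence[str],
    inner_names: Sequence[str],
    nested_rows: Sequence[NestedRow],
) -> Tuple[List[str], List[Tuple[object, ...]]]:
    """Flatten a two-level loop by repeating outer values on every inner row.

    An outer row with no inner rows produces no output rows in this sketch.
    """
    header = list(outer_names) + list(inner_names)
    flat_rows: List[Tuple[object, ...]] = []
    for outer_values, inner_rows in nested_rows:
        for inner_values in inner_rows:
            flat_rows.append(tuple(outer_values) + tuple(inner_values))
    return header, flat_rows


def format_cif_loop(header: Sequence[str],
                    rows: Sequence[Sequence[object]]) -> str:
    """Write a flat table as a single-level loop_ block.

    Values containing white space would need quoting; that is omitted here.
    """
    lines = ["loop_"] + list(header)
    lines += [" ".join(str(value) for value in row) for row in rows]
    return "\n".join(lines)


if __name__ == "__main__":
    # Two hypothetical models, each with its own list of atoms.
    nested = [
        ((1,), [("C1", 0.10), ("O1", 0.20)]),
        ((2,), [("C1", 0.11)]),
    ]
    header, flat = unroll_nested_loop(
        ["_model_id"], ["_atom_label", "_atom_x"], nested)
    print(format_cif_loop(header, flat))
```

The flat table repeats the outer (model) value on every row, which is
where the loss of efficiency relative to the nested form comes from.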

The resulting CIFs might be very inefficient as working files within the
applications specific to each discipline, but they would permit free
exchange of data files across disciplines, and would be fully amenable to
manipulation by standard CIF software.
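
As a rough illustration of both the inefficiency and the portability, the
sketch below uuencodes a binary fragment into a semicolon-delimited text
field using Python's standard binascii module. The function names and the
overall layout are assumptions made for this example and do not describe
any agreed format. Each encoded line is indented by one space, because a
uuencoded line carrying 27 bytes begins with ';', which in column one
would otherwise end the text field. Uuencoding turns every 3 bytes into 4
characters, so the ASCII form is at least a third larger than the binary.

```python
import binascii

BYTES_PER_LINE = 45  # the most that one uuencoded line can carry


def encode_binary_field(data: bytes) -> str:
    """Return the data as a ';'-delimited text field of uuencoded lines."""
    lines = []
    for start in range(0, len(data), BYTES_PER_LINE):
        chunk = data[start:start + BYTES_PER_LINE]
        encoded = binascii.b2a_uu(chunk, backtick=True).decode("ascii")
        # The leading space keeps a ';' length character out of column one.
        lines.append(" " + encoded.rstrip("\n"))
    return "\n".join([";"] + lines + [";"])


def decode_binary_field(field: str) -> bytes:
    """Inverse of encode_binary_field."""
    body = field.split("\n")[1:-1]  # drop the two ';' delimiter lines
    return b"".join(binascii.a2b_uu(line[1:]) for line in body if line)


if __name__ == "__main__":
    payload = bytes(range(256))
    field = encode_binary_field(payload)
    assert decode_binary_field(field) == payload
    print(len(payload), "bytes ->", len(field), "characters")
```

Each line is at most 62 characters long, so the field stays within an
80-character line limit.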

I would be interested in hearing reactions to this, from both policy and
technical viewpoints, especially from Eldon and Andy as representatives of
their respective communities.

---------
Regards
Brian