Re: OMG proposal for macromolecular structure

  From: Philip Bourne
  Date: Tue, 19 Sep 2000 13:18:59 +0100 (BST)
Thanks Brian. With regard to other work I append here a recent
correspondence that we had with David that might be of interest to this
list. It was prepared by John Tate.


Date: Fri, 15 Sep 2000 11:18:49 -0700
From: John Tate
To: I. David Brown
Cc: Philip Bourne <bourne@sdsc.edu>
Subject: Re: Comcifs

Dear David,

Phil asked me to give you an overview of where we are with a dictionary
for describing representations of macromolecular structures.

Firstly I should probably explain our motivation for designing such a
dictionary, and the requirements which we have for it. I work on the MICE
project here at SDSC, a project that aims to improve the tools available
for viewing and interacting with three-dimensional macromolecular
structures. There is an overview of the project on the MICE website at:


The core of MICE is an interactive molecular structure viewer, written
entirely in Java, which currently runs on several flavours of both Windows
and Unix. Currently MICE uses VRML as the format for describing molecular
structures but this has several limitations, not least of which is that
VRML cannot encapsulate low level information about the structure it
describes, such as individual atomic coordinates.

In the next version of MICE, we want to separate the atom-level (PDB) data
from the user-defined representation of those data, and to do this have
defined a language that lets a user describe how a certain structure
should be depicted in three dimensions, in much the same way that a rasmol
or molscript script file would. Such a description of a structure must be
a truly three-dimensional description, so that the user can look at the
structure interactively from any viewpoint and query the underlying atomic
data. There is a rough overview of the features and requirements for the
scene description language at:


At the outset we had planned to use a CIF-style dictionary to define the
Molecular Scene Description Language (MSDL), since protein structure
information is increasingly available in mmCIF format and acceptance of
CIF by the community is growing. For the MICE project, however, a
significant problem with using CIF is that there is still very little
software support for CIF generally, and, to the best of my knowledge,
absolutely no CIF software written in Java. This would make wide
deployment of a platform-independent CIF-based language difficult and
would mean that community acceptance of MSDL as a useable and useful
language would be slow at best.

Instead we have cast MSDL as an XML-based language, allowing us to cash in
on the wide-ranging support for XML in the Internet community, the large
amount of XML-based software and the wide range of related APIs that are
already available. We have a draft version of the DTD (an older example
can be found on the MSDL webpage) and an accompanying example MSDL
document. Some time ago I built a Java application which parsed the XML
DTD, followed by an MSDL document, and constructed an internal
data-structure that could then be queried and manipulated using standard
Java methods. This application was in no sense complete, but it was a
useful testbed for developing the MSDL DTD and experimenting with
different ways of representing structure in an MSDL document.
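
As an illustration of that kind of testbed (this is not the original
application), a minimal sketch using the standard JAXP DOM parser might
look like the following. It validates a small document against an inline
DTD and collects the result into an ordinary Java map. The element and
attribute names (scene, representation, selection, style) and the class
name MsdlSketch are hypothetical; they are not taken from the MSDL DTD.

```java
import java.io.StringReader;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;
import org.xml.sax.ErrorHandler;
import org.xml.sax.InputSource;
import org.xml.sax.SAXParseException;

class MsdlSketch {

    // Hypothetical scene document with an inline DTD. These names are
    // invented for illustration and are NOT the real MSDL DTD.
    public static final String EXAMPLE =
        "<?xml version=\"1.0\"?>\n"
      + "<!DOCTYPE scene [\n"
      + "  <!ELEMENT scene (representation+)>\n"
      + "  <!ELEMENT representation EMPTY>\n"
      + "  <!ATTLIST representation\n"
      + "    selection CDATA #REQUIRED\n"
      + "    style (wireframe|spacefill|cartoon) #REQUIRED>\n"
      + "]>\n"
      + "<scene>\n"
      + "  <representation selection=\"chain A\" style=\"cartoon\"/>\n"
      + "  <representation selection=\"ligand\" style=\"spacefill\"/>\n"
      + "</scene>\n";

    // Parses and DTD-validates the document, then maps each selection
    // to its drawing style, preserving document order.
    public static Map<String, String> parseStyles(String xml) {
        try {
            DocumentBuilderFactory factory = DocumentBuilderFactory.newInstance();
            factory.setValidating(true); // check the document against its DTD
            DocumentBuilder builder = factory.newDocumentBuilder();
            // Turn validation errors into exceptions instead of log output.
            builder.setErrorHandler(new ErrorHandler() {
                public void warning(SAXParseException e) { }
                public void error(SAXParseException e) throws SAXParseException { throw e; }
                public void fatalError(SAXParseException e) throws SAXParseException { throw e; }
            });
            Document doc = builder.parse(new InputSource(new StringReader(xml)));

            Map<String, String> styles = new LinkedHashMap<>();
            NodeList reps = doc.getElementsByTagName("representation");
            for (int i = 0; i < reps.getLength(); i++) {
                Element rep = (Element) reps.item(i);
                styles.put(rep.getAttribute("selection"), rep.getAttribute("style"));
            }
            return styles;
        } catch (Exception e) {
            throw new IllegalArgumentException("invalid scene document", e);
        }
    }

    public static void main(String[] args) {
        parseStyles(EXAMPLE).forEach(
            (selection, style) -> System.out.println(selection + " -> " + style));
    }
}
```

Because validation is switched on, a document that uses a style outside
the list declared in the DTD is rejected at parse time rather than
reaching the viewer code.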

The next version of MICE will have the ability to read and write MSDL
documents directly, and we have recently been working on designing an
architecture for the application that will allow us to do this. In
parallel with that development, we have been revisiting the design of the
DTD, to ensure that the layout of an MSDL document is amenable to several
different uses. This is why the version of the DTD on our website is
somewhat old and out of date - since it's still in a state of flux,
there's no real point in keeping up with the very latest versions of the
DTD and example XML.

I hope that's a useful summary of what we've been doing with the scene
description language. If you have any questions, or would like more detail
about MICE or MSDL, please feel free to contact me.

All the best,


On Tue, 19 Sep 2000, Brian McMahon wrote:

> Dear Phil
> Thanks for the additional attribution. Of course, I should have mentioned
> Doug (with whom I had a brief correspondence), and the plaudits go to
> all the co-workers in this area.
> I'm not really within the loop of this work; I'm also aware that the
> mmCIFers and friends have been working on a number of other initiatives
> with bioinformaticists in related areas. It may also be that other CIF
> groups are engaged in collaborative work across disciplines. I'm sure
> comcifs-l would be glad of occasional news of any such work in progress.
> In addition, David Brown would be happy to report any substantial
> developments in the IUCr Newsletter if he is given an appropriate feed,
> and we could examine ways of making successful collaborative projects known
> to a wider community.
> Best wishes
> Brian
On Tue, Sep 19, 2000 at 03:58:08AM +0100, Philip Bourne wrote:
> > Hi: It should be noted that this is a joint effort and that Doug Greer at
> > UCSD has been working on this for over a year now on behalf of the PDB.
> > 
> > Cheers../Phil

