Discussion List Archives

Re: [ddlm-group] Relationship among CIF2, STAR, CIF1 and Python

OK, the nature of my particular misunderstanding about the STAR/CIF
relationship that came to light in our offline discussions is roughly
the following:

CIF1 is essentially a proper subset of the STAR format published as:
  Hall, S. R. (1991). The STAR File: a new format for electronic data
      transfer and archiving. J. Chem. Inf. Comput. Sci. 31, 326-333;
  Hall, S. R. & Spadaccini, N. (1994). The STAR File: detailed
      specifications. J. Chem. Inf. Comput. Sci. 34, 505-508
and described in detail in Chapter 2.1 of International Tables Volume G.

This version of STAR is used in the molecular information file, also
documented in Volume G ("used" is probably overstating the case; the
only application I know that outputs MIF content is the CCDC, which
uses tokens from the MIF and CIF core dictionaries but ignores
saveframe pointers and nested loops to create files that are
syntactically perfectly valid CIFs). There is also nmrSTAR, used
extensively by BioMagResBank, which has supporting libraries and
database applications. There have also been some small-scale
experiments in the botanical field (Syd's association with FloraBase)
and a couple of demonstrator applications that, so far as I am aware,
were never developed (e.g. in quantum chemistry).

In prototyping dREL and a DDLm, Syd, Nick and Ian Castleden made
ad hoc changes to the STAR syntax to get a workable implementation.
(Since their prototyping engine used Jython, they achieved runtime
efficiencies by implementing changes that were practicable with
Python, echoes of which we're seeing and actively discussing today.
Whether their choice was farsighted or purely accidental I don't
know.) Let us call this ad hoc version STAR+1 - it was a set of
practical syntactic features that would be used mostly in dREL methods
but also in proto-DDLm dictionaries, proto-dREL and appropriately
modified data files to test the novel methods approach. Most of this
work dates back about 10 years. The syntactic changes were not
formally published - they were practical "work in progress", though by
the end of this cycle it was conceivable that they could have been
systematised and written up as a proper "STAR+1".

Since COMCIFS took on the task of developing CIF2/DDLm for
crystallography (i.e. the work of this group), we have discussed and
agreed many further changes from the original STAR syntax, much of
this with active involvement from Nick. When, some time back, Nick
said (whether just to me or on the list I don't now remember) that
he was focussing on writing up for publication a revised STAR paper, I
took that to mean that he wanted to freeze the further modifications
that had been agreed to that point as a "STAR+2". From that point I
was reluctant to see CIF diverge further from the then-current syntax,
and was looking forward to Nick's preprint which would document
clearly what that was. I was mistaken - Nick's current project is to
write up "STAR+1", leaving open the prospect of further changes to
"STAR+2" as required.

Note that even "STAR+1" never existed - Nick's paper will be a
retrospective consolidation of one set of changes adopted for practical
prototyping. In the same way, "STAR+2" need not exist until we
actually have a satisfactory CIF2 format that we can retrofit -
if that's actually required - to a second-generation STAR complete
with saveframes and the rest. Such a "requirement", in my mind, would
have to do with an actual need to retain compatibility with those
other STAR applications (MIF, FloraBase etc.) that I mentioned before. 
Realistically, that's probably not going to happen.

I think that most people on this list have been much quicker than me
to see that demonstrably useful syntax changes should still be made
without undue conservatism. The result is that we have been pulling
together roughly in the same direction (not always *exactly* in the
same direction) and have made real progress.

I'm embarrassed by my misunderstanding, and were we to revisit some of
our discussions I might now take another view (but only "might").
But as I argue elsewhere I think we're better moving on to test the
consequences of the solutions we've agreed to adopt, and being open to
future revisions in the light of experience, rather than re-running
past hypotheticals.

Best wishes
Brian

On Thu, Jan 13, 2011 at 12:17:41PM -0500, Herbert J. Bernstein wrote:
> James has requested that I formally send a message to this list
> about a matter discussed recently in independent email in order
> to ensure a record.  At first I declined to do so, but after
> reflection, I have decided to do as James has asked.
> 
> I have withdrawn my vote in COMCIFS in support of CIF2 going
> forward at this time.  I have done so because, after emails
> from Nick and Brian, it has become clear to me that I was
> making false assumptions about the relationship between
> CIF2 and STAR.  I believe that a zero-based discussion is
> now needed on what the relationship should be among CIF2,
> STAR, CIF1 and Python to best serve the interests
> of the crystallographic community.  I do not know what
> is best and do not know how long such a discussion may take.
> I leave it to James, Nick and Brian to decide if Nick's and
> Brian's messages should be posted on this list for the record.
> 
> =====================================================
>   Herbert J. Bernstein, Professor of Computer Science
>     Dowling College, Kramer Science Center, KSC 121
>          Idle Hour Blvd, Oakdale, NY, 11769
> 
>                   +1-631-244-3035
>                   yaya@dowling.edu
> =====================================================
> 
> _______________________________________________
> ddlm-group mailing list
> ddlm-group@iucr.org
> http://scripts.iucr.org/mailman/listinfo/ddlm-group