(75) mmCIF Workshop; imgCIF/CBF Workshop; review of COMCIFS
- To: COMCIFS@iucr.ac.uk
- Subject: (75) mmCIF Workshop; imgCIF/CBF Workshop; review of COMCIFS
- From: bm
- Date: Fri, 24 Oct 1997 13:16:23 +0100
Dear Colleagues

It was good to see some of you again at the mmCIF and image workshops this past week. I would like to offer a brief synopsis of the workshops (from my viewpoint) and then make some comments on how I see them affecting the proposed COMCIFS review. People who were at the workshops but who see things differently are of course welcome to send in their view of things.

D75.1 mmCIF Workshop
--------------------
In many respects, this felt much more a CIF workshop than the earlier meetings in this series. While the York, Tarrytown and Brussels meetings had identified, analysed and ultimately resolved structural difficulties in the data model underlying the mmCIF dictionary, this meeting established an atmosphere of purposeful adoption and extension of the standard.

The only disappointment was the lack of obvious progress in developing software tools beyond those so well established at the NDB and developed by Phil Bourne and Herbert Bernstein; but there was evidence that some of the crystallographic software authors are beginning to make suitable provision for implementing the new standard.

The absolute need to manage future dictionary extensions effectively came up in a number of discussions, and there seemed to be some support for the establishment of a central register of local and extension dictionaries along the lines being developed by the Dictionary Maintenance Working Group. (Reminder: the discussions so far are at http://www.iucr.org/cif/comcifs/wg1/; we also made progress on the technical issues of dictionary merging.)

Helen Berman, John Westbrook and their co-workers are to be congratulated on the smooth running of this meeting.

D75.2 Image Workshop
--------------------
Although I wasn't able to stay until the end of the image workshop at Brookhaven, there seemed to be very substantial progress in defining the goals of the project and establishing a working framework for their implementation.
This workshop was particularly notable for its level of commercial sponsorship and support. Equipment manufacturers declared themselves ready to adopt and endorse a hard standard if one emerged. In that light, the meeting addressed two particular issues of interest to COMCIFS: the mechanism for handling binary data in a CIF framework, and the general structure of the data model for describing the relevant instrumentation.

The solution to the binary/ASCII quandary was the following. The need to handle very large data sets at high efficiency (a requirement that could involve taking advantage of machine-dependent features) makes the adoption of an ASCII-only format unacceptable. Consequently, a new format file will be developed, called CBF (Crystallographic Binary File), which has certain header and blocksize properties appropriate to optimum handling on existing machine architectures. The contents of the file will be held in tag/data associative groupings that follow the CIF model, and the tags will be CIF ASCII tokens. The tokens will be defined in a standard DDL2 dictionary file, so that a dictionary of CIF terms suitable for full assimilation into other CIF data files may be constructed. This dictionary might be known as the imgCIF dictionary.

Software will be made available to convert a CBF to a fully-compliant CIF, where the binary data streams (representing image data) will be suitably ASCII encoded (perhaps using MIME or some subset of MIME encoding as a publicly documented standard).

The contents of the imgCIF dictionary will include descriptors of the binary data array, and descriptions of its size, ordering, dynamic compression and other technical details needed to rebuild the image. But there will also be a very rich set of instrumental descriptions, and the definition set drawn up for this purpose will greatly extend the current sparse DIFFRN_DETECTOR and *_SOURCE and *_MEASUREMENT categories.
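
As an illustration only (nothing of this kind was specified at the workshop), the CBF-to-CIF conversion described above might be sketched in Python as follows. The sketch assumes base64, the transfer encoding commonly used in MIME, and carries the encoded stream in a semicolon-delimited CIF text field. The tag name _example_image.data and both function names are hypothetical and are not drawn from any imgCIF dictionary.

```python
import base64


def encode_binary_as_cif_text(tag: str, payload: bytes, width: int = 76) -> str:
    """Return a CIF-style item: the tag, then a semicolon-delimited text
    field holding the base64 encoding of `payload`.

    Assumption: base64 stands in for "MIME or some subset of MIME encoding".
    """
    encoded = base64.b64encode(payload).decode("ascii")
    # Break the encoded stream into fixed-width lines. The base64 alphabet
    # contains no ';', so no data line can be mistaken for the terminator.
    chunks = [encoded[i:i + width] for i in range(0, len(encoded), width)]
    return "\n".join([tag, ";", *chunks, ";"])


def decode_cif_text(field: str) -> bytes:
    """Recover the original bytes from a field built by the encoder above."""
    lines = field.split("\n")
    # lines[0] is the tag; lines[1] and lines[-1] are the ';' delimiters.
    return base64.b64decode("".join(lines[2:-1]))


if __name__ == "__main__":
    image = bytes(range(256))  # stand-in for a detector image stream
    field = encode_binary_as_cif_text("_example_image.data", image)
    assert decode_cif_text(field) == image
    print(field.split("\n")[0])
```

The round trip is lossless, which is the property that matters for archival or machine-independent transfer; the cost is roughly a one-third increase in size over the raw binary stream.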
It seems essential to me that this work should not be held up because of concerns over the binary implementation. The binary file is not, and will not be called, CIF; but its content is fully describable in a CIF dictionary and it may be converted easily - if not trivially - to a fully compliant CIF when and if ever true archival or machine-independent transfer are required. The type of delay that followed early resistance to the DDL2-style extensions to mmCIF would be hurtful to this project, and to its full integration with mmCIF and coreCIF requirements for instrument characterization. So I would recommend that COMCIFS be prepared to endorse fully the imgCIF dictionary, and to consider whether an extension of its remit to cover and protect the CBF format is also appropriate.

Note that I do not regard CBF as necessarily a general solution to the question of how to include binary information in the CIF framework. There are different issues involved in the incorporation of graphics file formats for publication, or generic multimedia annotations, that are probably best addressed separately from this application-specific approach.

==========================

Based on the energy underlying both workshops, and on their complementarity yet diversity of aim, I am prompted to throw in a few suggestions of my own to the current review of the future of COMCIFS:

D70.1 COMCIFS Review
--------------------
Syd proposes a three-tier system of executive, project subcommittees and project working parties. I think this is on the right lines, but the inter-relationship between the project subcommittees and project working parties isn't completely clear to me. I think instead the three tiers that are necessary should be:

(1) executive
(2) dictionary subcommittees
(3) technical working groups

where (2) and (3) both report to (1).
The distinction would be that dictionary subcommittees (2) shall have responsibility for the maintenance of dictionaries, and have essentially indefinite duration; but technical working groups (3) should be established to address specific questions posed by the executive and should have limited lifetimes. Only members of the executive would have voting rights; both dictionary subcommittees and technical working groups should have at least one member belonging to the executive. If you prefer more traditional bureaucratic nomenclature, (2) constitutes Standing Committees, (3) ad-hoc Committees.

Membership: The executive committee should be small enough to be effective, large enough to bring viewpoints from across the discipline, and indeed from other disciplines (I have in mind that there should be at least one member who is knowledgeable about informatics). Six sounds about right. The members should be chosen for their technical expertise, and not ex officio in consequence of their standing on other IUCr bodies. Hence, I question the wisdom of Syd's suggestion that the Chair of the Database Committee should - ex officio - be COMCIFS Vice-Chair (I hasten to add that's a structural criticism, not a reference to any individual!).

On the other hand, there are several bodies who feel that they have a right to representation in this forum, and they might have Observer status. Likely candidates would be the IUCr President, representatives of the Database Committee, Electronic Publishing Committee, Journals Commission, Nomenclature Commission.

A case might be made to have an additional type of membership - call it Associate Member, perhaps - for the Chairs of subcommittees and technical working groups. Such associate members would have the responsibility for reporting their progress to the executive through the Full Member appointed to liaise with them.
The executive should conduct open discussion through a mailing list (membership of which should be restricted to the executive and, perhaps, the Observers) and not through the current moderated discussion. In like manner, the subcommittees and technical working groups may conduct their business through separate mailing lists. It would be beneficial to manage all the mailing lists through Chester, or at least to mirror the discussions there.

A coordinating secretary would assist the executive in formulating an agenda, in liaising between the various subcommittees and working groups, and in posting public notices and summaries at the direction of the executive. The coordinating secretary need not, of course, have executive rights. (Nor need it be the current incumbent. I feel it very much an honour and a privilege to serve in this role, but it's rather like the privilege accorded to the little unarmed drummer-boy of leading the regiment into the teeth of battle!)

These are some of the matters that I think the executive would need to address in the near future:

- The effective management of a distributed dictionary system
- The responsibility for managing DDL and not solely CIF dictionaries
- The continuing overhead (or not) of maintaining two DDLs
- Related to that, the implications of permitting nested loops and other STAR constructs
- Implementation of _type_construct and of methods
- Identification of areas of crystallography still lacking a dictionary
- Adoption of the CBF format as an IUCr standard
- Management of binary attachments to archival CIFs

Though many are issues of policy, they need to be thoroughly investigated technically before the policy is enunciated. For example, it is generally felt that nested loops would be A Good Thing; but it is essential, I think, to have working applications that can handle the greater complexity of nested loops before the current flat file representation can be abandoned.
A technical working group would write a library for nested loop manipulation *before* it was adopted as a policy shift. In any case, several of these issues will come before COMCIFS in the near future, whatever its structure after review.

With good wishes

Brian