publCIF – free software to edit and preview a CIF for publication

New in this version

This beta release of publCIF introduces a major new feature:

  structure visualization using Jmol (www.jmol.org), implementing a 'local' version of the IUCr's online enhanced figure toolkit (jtkt)

When you launch Jmol (e.g. from the toolbar button), you will be presented with an interactive visualization of the structure in your default internet browser. The interface will 'look and feel' like a web service, but no internet connection will be necessary – publCIF will act as the 'web server', managing the data transfer* between the open CIF and the Jmol 'web pages'.

In this way, publCIF now provides an excellent tool for examining the structures in your CIF, as well as the opportunity to create 'enhanced' interactive figures, either for formal publication or for your own presentation purposes.


* publCIF serves, stores and processes Jmol scripts to prepare the Jmol toolkit pages and present both static images and 'enhanced' figures.

Feedback from this beta release would be most welcome.

Getting started

If you are using publCIF for the first time, it is recommended that you open a trial CIF to familiarize yourself with publCIF's functions.

Important concepts to bear in mind are:

Introduction

Although the crystallographic information file (CIF) provides an excellent architecture for archiving and accessing crystallographic data electronically [International Tables for Crystallography (2005). Volume G, Definition and exchange of crystallographic data, edited by S. R. Hall and B. McMahon. Heidelberg: Springer], it is still rather 'unfriendly' with respect to human-readability and manual manipulation, e.g. for a non-crystallographer who wishes to publish a paper in a journal that only accepts CIF submissions of structure reports.

publCIF addresses this problem by offering the user a more familiar interface to the CIF through an HTML representation of the required publication data, although the raw CIF can still be viewed. It is hoped that publCIF will help the user to work with CIF syntax and structure, e.g. the translation of CIF symmetry codes such as 1_456 to x − 1, y, z + 1 can be seen in 'real time'.
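The symmetry-code translation mentioned above can be illustrated with a short Python sketch. It is not publCIF's own code: the function names are hypothetical, and it assumes the usual n_klm convention, in which n is the 1-based number of the symmetry operation in the CIF's operator list and each digit of klm is a lattice translation offset by 5 (so 555 means no translation).

```python
import re
from fractions import Fraction


def parse_component(expr):
    """Split one operator component, e.g. '-x+1/2', into ('-x', Fraction(1, 2))."""
    expr = expr.replace(" ", "").lower()
    axis_part = ""
    const = Fraction(0)
    # Walk over signed terms such as '-x', '+y' or '+1/2'.
    for term in re.findall(r"[+-]?[^+-]+", expr):
        body = term.lstrip("+-")
        negative = term.startswith("-")
        if body in ("x", "y", "z"):
            axis_part += ("-" if negative else "+") + body
        else:
            const += -Fraction(body) if negative else Fraction(body)
    return axis_part.lstrip("+"), const


def format_component(axis_part, const):
    """Render the axis term followed by any non-zero constant shift."""
    if const == 0:
        return axis_part
    sign = "+" if const > 0 else "-"
    return f"{axis_part} {sign} {abs(const)}"


def translate_symmetry_code(code, symops):
    """Expand a code such as '1_456' using the list of operator strings."""
    n, klm = code.split("_")
    operator = symops[int(n) - 1]          # operators are numbered from 1
    shifts = [int(digit) - 5 for digit in klm]  # 5 means 'no translation'
    parts = []
    for component, shift in zip(operator.split(","), shifts):
        axis_part, const = parse_component(component)
        parts.append(format_component(axis_part, const + shift))
    return ", ".join(parts)


symops = ["x, y, z", "-x, y+1/2, -z"]
print(translate_symmetry_code("1_456", symops))  # x - 1, y, z + 1
print(translate_symmetry_code("2_565", symops))  # -x, y + 3/2, -z
```

The sketch uses an ASCII hyphen for the minus sign and only handles components built from x, y, z and rational constants.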

CIF and Preprint

On opening a CIF, it will be displayed in the CIF window. If it is a small-molecule CIF, i.e. conforming to the CIF core dictionary (the powder extension is also supported), a Preprint will be generated and displayed in the Preprint window.

These two windows are linked and typing in either window will affect the other. This functionality can be switched on and off (see below). If making major changes to the CIF, or manually adding data items, it is advisable to break the link by setting the Preprint to 'read-only'.

The two windows offer the user two approaches to editing the CIF: the Preprint window can be used to write text sections and apply formatting (especially as publCIF supports the use of italic and bold tags in the main text fields), while the CIF window shows the entire CIF and provides access to dictionary data and checking functions for all data items. The link between the windows also assists in navigating the CIF (clicking in the Preprint window takes you to the appropriate item in the CIF window).

The CIF window

The CIF window displays the CIF as a standard ASCII file. No syntax highlighting is applied and no formatting buttons are provided, i.e. it is a basic plain-text editor.

The main CIF functions are available on a per-item basis from the right-mouse-button menu. If you click a data item in the CIF window then activate the right-mouse menu, you can access the dictionary information for that data item, including a list of permitted values if appropriate.

As publCIF supports the use of italic (<i>) and bold (<b>) tags in the main text fields of the CIF, if you need to include the literal string <i>, i.e. not intended to be a tag but representing e.g. the average of parameter 'i', it will be necessary to use the \\langle and \\rangle CIF entities: \\langle i\\rangle. publCIF performs this translation automatically if you type <i> in the Preprint.
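As a rough illustration of the translation described above, the following Python sketch rewrites angle-bracketed text typed as literal characters into the \\langle ... \\rangle form. The function name is hypothetical and the rule (treat every typed <...> as literal) is an assumption based on the description here, not publCIF's actual implementation.

```python
import re


def escape_literal_angle_brackets(typed_text):
    r"""Rewrite every <...> span as \\langle ...\\rangle (CIF entity form)."""
    return re.sub(
        r"<([^<>]*)>",
        # Each Python "\\\\" below is two literal backslashes in the output.
        lambda m: "\\\\langle " + m.group(1) + "\\\\rangle",
        typed_text,
    )


print(escape_literal_angle_brackets("the average <i> was refined"))
```

Formatting applied with the Preprint buttons would bypass a step like this, so that genuine italic and bold tags remain as <i> and <b> in the CIF.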

Although the CIF standard allows line lengths of 2048 characters, publCIF enforces the traditional 80-character cut-off for reasons of compatibility with existing CIF software. Long lines (i.e. strings of over 80 characters with no white space) are handled according to the mechanism described in International Tables for Crystallography, Volume G (Section 2.2.7.4.11), whereby the line is split and terminated with a backslash ("\"), and a backslash is added to the start of the text field. It is advisable to let publCIF apply this to the CIF automatically when typing in the Preprint; it is not applied automatically when typing in the CIF window.
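The folding mechanism described above can be sketched in Python as follows. This is a simplified illustration with hypothetical function names, not publCIF's routine: it folds a single unbroken string into a semicolon-delimited text field and ignores edge cases such as a continuation line that happens to begin with a semicolon or a value that itself ends with a backslash.

```python
def fold_long_string(value, width=80):
    """Fold one long string into text-field lines no longer than `width`."""
    chunk = width - 1  # leave room for the trailing backslash
    pieces = [value[i:i + chunk] for i in range(0, len(value), chunk)] or [""]
    lines = [";\\"]  # a backslash after the opening semicolon marks a folded field
    for piece in pieces[:-1]:
        lines.append(piece + "\\")  # backslash means 'continued on next line'
    lines.append(pieces[-1])        # the final piece carries no backslash
    lines.append(";")
    return lines


def unfold_text_field(lines):
    """Reverse fold_long_string: strip the markers and rejoin the pieces."""
    if lines[0] != ";\\" or lines[-1] != ";":
        raise ValueError("not a folded text field")
    return "".join(
        line[:-1] if line.endswith("\\") else line for line in lines[1:-1]
    )


long_value = "C" * 200
folded = fold_long_string(long_value)
print("\n".join(folded))
print(unfold_text_field(folded) == long_value)
```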

If you wish to paste text into the CIF window from external sources, publCIF will attempt to convert any symbols to CIF format (e.g. α to \a), but any bold and italic formatting will be lost (however, the Preprint window allows pasting of formatted text - see below).
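A minimal Python sketch of this kind of symbol conversion is given below. The mapping covers only a handful of Greek letters using the standard CIF backslash codes; the table and function name are illustrative, and publCIF's own conversion table is assumed to be far more extensive.

```python
# Illustrative subset of the CIF Greek-letter codes.
GREEK_TO_CIF = {
    "α": r"\a",
    "β": r"\b",
    "γ": r"\g",
    "δ": r"\d",
    "θ": r"\q",
    "λ": r"\l",
    "μ": r"\m",  # Greek small letter mu
    "µ": r"\m",  # micro sign, often pasted in place of mu
}


def symbols_to_cif(text):
    """Replace each known symbol with its CIF code, leaving other text as is."""
    return "".join(GREEK_TO_CIF.get(ch, ch) for ch in text)


print(symbols_to_cif("Cu Kα radiation, λ = 1.5418"))
```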

The Preprint window

The Preprint window is a basic word processor, enabling you to apply formatting to text using familiar buttons, etc. It also takes care of line wrapping (enforcing the 80 character line length in the CIF and applying the appropriate mechanism for lines over 80 characters - see above). The main text sections of the Preprint (title, abstract, references, etc.) can be edited in this way, the CIF being updated as you type.

Symbols and accents can be inserted into text sections in the Preprint window using the Insert symbol tool (see the Preprint menu).

publCIF supports the use of italic (<i>) and bold (<b>) tags in the main text fields of the CIF. Such formatting is best applied using the Preprint formatting buttons to ensure matching opening and closing tags. If you manually type <i> in the Preprint, it will not be interpreted as an italic tag, but translated to \\langle i\\rangle in the CIF.

A single mouse click in the Preprint window will highlight the corresponding data item in the CIF window.

Double-clicking in the Preprint window will activate a data input wizard where applicable (see below).

Standard items missing from the CIF will be flagged with a '?' in the Preprint window, enabling them to be noted and completed. A single click on such an item will activate a data input wizard (see below) for that data item.

An additional advantage of the use of the Preprint window is that it is possible to paste text copied from external sources such as Microsoft Word and other word processors and web browsers, and retain the formatting and character codes. The text will then be converted to CIF format through publCIF's 'update as you type' functionality. Note: this feature does not support the pasting of images, frames, or other non-text objects.

The Preprint can be exported as an HTML document for printing and review purposes (HTML being highly portable).

Preprint button

This button generates a Preprint from the CIF:

General preprint – showing all items for a 'small-molecule' CIF, including 'supplementary materials'

Acta C preprint – essentially unchanged since previous versions

Acta E preprint – including the new 'related_literature' item and providing tools to generate it

This version of publCIF also allows data comparison tables in the style of Acta Crystallographica Section B to be generated. Work on mmCIF 'templates' is in progress.

On opening a particular CIF for the first time, publCIF will apply autoformatting to the main text sections, e.g. adding italic tags to certain phrases, adding spaces before units (displayed as grey dots in the Preprint), and making certain stylistic changes appropriate for publication in IUCr journals. It will not do this again for that particular CIF unless this is specifically requested when generating the Preprint (choose Full formatting from the Preprint menu). The _audit_update_record data item is used to record that the CIF has been 'Formatted by publCIF'.

There are times when it is necessary to regenerate the Preprint:

Display modes

The layout of the CIF window and the Preprint window can be changed by clicking this button. The CIF can appear on the left or right, or above or below the Preprint.

Navigation modes

By default the CIF and Preprint are linked and mutually updateable. This link is represented by showing cursors in the respective windows and each window will automatically scroll to the corresponding item shown in the other window (synchronized navigation). There are times when it may be desirable to stop the auto-scrolling but maintain the auto-updating, e.g. while working on the main text in the Preprint window but wishing to refer to the reference list in the CIF window. The second button (free navigation) allows you to toggle to this view mode.

Read-only preprint

Setting the Preprint window to 'read-only' allows you to work freely with the CIF without publCIF trying to update the Preprint. This may be necessary when manually changing the 'structure' of the CIF, e.g. changing a loop structure, pasting in chunks from other CIFs, combining multiple CIFs...

View preprint only

When writing the main text sections of the paper, it may be preferable to work with the Preprint window only. Toggling to this mode removes the need for visual updates of the CIF as you type and thus can improve the performance of the interface, especially if you are writing a long paper. The main text sections are updated when you leave this mode by e.g. toggling to the default synchronized navigation mode, or when you save the CIF.

CIF checking

publCIF provides extensive CIF syntax and data-validation routines, many of which are applied as you type. Warning messages are written to the log file and shown in the status bar beneath the Preprint window. Click View log to display the log form (which can remain open while you work if you wish).

The main checking functions are described below.

Crystallographic analysis is beyond the scope of the program, although a few rudimentary checks are included (such as checking the cell setting with respect to the cell parameters). If submitting your paper to an IUCr journal, you should always pre-check it using the IUCr's online checkCIF facility. The Tools menu includes an Online checkCIF item for convenient uploading of the open CIF. Furthermore, if the resulting check report contains a Validation Reply Form, you can include it in your CIF by simply clicking a button. If you are unfamiliar with online checkCIF reports and Validation Reply Forms, please visit http://checkcif.iucr.org
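To give an idea of what a rudimentary cell-setting check might involve, the Python sketch below tests whether a declared cell setting is consistent with the cell parameters, using the standard lattice constraints for each crystal system. The function names and the tolerance are assumptions for illustration; the checks actually applied by publCIF may differ.

```python
import math


def close(x, y, tol=1e-3):
    """True if x and y agree to within an absolute tolerance."""
    return math.isclose(x, y, abs_tol=tol)


def cell_setting_consistent(setting, a, b, c, alpha, beta, gamma, tol=1e-3):
    """Check the cell lengths and angles (degrees) against the declared setting."""
    right = [close(angle, 90.0, tol) for angle in (alpha, beta, gamma)]
    s = setting.lower()
    if s == "triclinic":
        return True  # no constraints
    if s == "monoclinic":
        return sum(right) >= 2  # at least two right angles
    if s == "orthorhombic":
        return all(right)
    if s == "tetragonal":
        return all(right) and close(a, b, tol)
    if s == "cubic":
        return all(right) and close(a, b, tol) and close(b, c, tol)
    if s in ("hexagonal", "trigonal"):  # assumes hexagonal axes
        return close(a, b, tol) and right[0] and right[1] and close(gamma, 120.0, tol)
    if s == "rhombohedral":
        return (close(a, b, tol) and close(b, c, tol)
                and close(alpha, beta, tol) and close(beta, gamma, tol))
    raise ValueError(f"unknown cell setting: {setting}")


print(cell_setting_consistent("monoclinic", 7.1, 9.8, 12.3, 90.0, 101.5, 90.0))
print(cell_setting_consistent("orthorhombic", 7.1, 9.8, 12.3, 90.0, 101.5, 90.0))
```

A check of this kind can only flag an inconsistency; it cannot confirm that the declared setting is the correct one, which is why online checkCIF remains essential.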

VCIF and publCIF checking

VCIF [a robust CIF syntax checking program (International Tables for Crystallography, Volume G, Section 5.3.2.1)] is included in the publCIF package and is run as an external program whenever a CIF is opened or closed, and whenever the Check CIF button is clicked.

In addition to parsing CIF syntax, publCIF performs dictionary compliance checks, i.e. that a data name is in the CIF dictionary, that the data value is of the type specified in the dictionary, and, if applicable, that it is in the specified range or matches a permitted value.
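The three kinds of dictionary check listed above (known name, correct type, permitted range or value) can be sketched in Python as follows. The two-entry dictionary is a toy stand-in whose structure is invented for this example; a real checker would read these definitions from cif_core.dic.

```python
import re

# Toy stand-in for parsed dictionary entries (structure is hypothetical).
TOY_DICTIONARY = {
    "_cell_length_a": {"type": "numb", "min": 0.0, "max": None},
    "_symmetry_cell_setting": {
        "type": "char",
        "enum": {"triclinic", "monoclinic", "orthorhombic", "tetragonal",
                 "rhombohedral", "trigonal", "hexagonal", "cubic"},
    },
}

# A CIF number, optionally followed by a standard uncertainty such as (2).
NUMB = re.compile(r"^([+-]?(?:\d+\.?\d*|\.\d+)(?:[eE][+-]?\d+)?)(?:\(\d+\))?$")


def check_item(name, value, dictionary=TOY_DICTIONARY):
    """Return a list of warning strings for one data item (empty if it passes)."""
    entry = dictionary.get(name)
    if entry is None:
        return [f"{name}: data name not found in dictionary"]
    if value in ("?", "."):
        return []  # unknown or inapplicable values are always allowed
    problems = []
    if entry["type"] == "numb":
        match = NUMB.match(value)
        if not match:
            return [f"{name}: '{value}' is not a number"]
        number = float(match.group(1))
        low, high = entry.get("min"), entry.get("max")
        if low is not None and number < low:
            problems.append(f"{name}: {number} is below the minimum {low}")
        if high is not None and number > high:
            problems.append(f"{name}: {number} is above the maximum {high}")
    allowed = entry.get("enum")
    if allowed and value not in allowed:
        problems.append(f"{name}: '{value}' is not a permitted value")
    return problems


print(check_item("_cell_length_a", "10.234(2)"))
print(check_item("_symmetry_cell_setting", "cubical"))
```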

CIF dictionaries

publCIF uses CIF dictionaries to look up data names and check data values against permitted values, etc. This version of publCIF is supplied with three CIF dictionaries: cif_core.dic, cif_pd.dic and cif_mm.dic. The core dictionary is loaded by default. If the CIF contains powder data items, publCIF will also load cif_pd.dic. If other dictionaries are required, they should be specified in an _audit_conform_(.)dict_name data item and the dictionary should be located in the 'dic' folder of the publCIF package.
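The dictionary-selection rules above might be expressed along the following lines. This Python sketch is illustrative only: the function name is hypothetical, and detecting powder data by the '_pd_' data-name prefix is an assumption about how such items could be recognized.

```python
def dictionaries_to_load(data_names, conform_dict_names=()):
    """Return the dictionary file names to load, in order, without duplicates."""
    wanted = ["cif_core.dic"]  # the core dictionary is always loaded
    # Assumption: powder data items are recognized by their '_pd_' prefix.
    if any(name.lower().startswith("_pd_") for name in data_names):
        wanted.append("cif_pd.dic")
    # Dictionaries named in the CIF's audit-conform items are added last.
    for dic in conform_dict_names:
        if dic not in wanted:
            wanted.append(dic)
    return wanted


print(dictionaries_to_load(["_cell_length_a", "_pd_meas_2theta_scan"]))
print(dictionaries_to_load(["_cell_length_a"], ["cif_mm.dic"]))
```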

publCIF incorporates a dictionary browser which can be accessed by pressing the CIF dictionary button. On opening, the dictionary browser will attempt to display the data item at the cursor position in the CIF window and present the dictionary item as a formatted page. This page may also contain information on how publCIF processes the data item when preparing the Preprint.

Although this version of publCIF is optimized for small-molecule CIFs, the macromolecular CIF dictionary (cif_mm.dic) is included as it is possible to open such CIFs in publCIF. A Preprint 'template' for macromolecular CIF data is currently under development.

Reference checking

Of the reasons for revision of a paper after submission, reference problems are amongst the most common. With this in mind, publCIF provides a few reference handling tools for use when writing a paper. Check citations parses the reference list, attempting to identify the authors and journal names, and then examines the CIF to check that each reference has been cited.

A summary report is presented upon completion of these checks and any possible ambiguities are highlighted in the Preprint window. More details are written to the log file.

The Preprint is set to read-only at this point and many of the other functions are disabled while in reference checking mode. To leave this mode, regenerate the Preprint.

On clicking Check citations, you are given the option to let publCIF verify any references to IUCr journals by accessing the IUCr's online indexes. If a match is found, publCIF will ensure that the CIF contains the full reference as contained in the index (i.e. adding final page numbers, etc.).

Reference checking is under development. Future releases will implement enhanced reference checking, integrated with both local and online databases of citations.
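The basic idea behind Check citations can be illustrated with a deliberately simple Python sketch. It assumes references in the 'Surname, Initials (Year). ...' style and looks only for the first author's surname followed shortly by the year in the text; the function names are hypothetical and publCIF's own parsing is assumed to be considerably more thorough.

```python
import re

# First author's surname up to the first comma, then the first (year).
REF_PATTERN = re.compile(r"^\s*([^,]+),.*?\((\d{4})[a-z]?\)")


def reference_key(reference):
    """Return (first-author surname, year) or None if the entry cannot be parsed."""
    match = REF_PATTERN.match(reference)
    if not match:
        return None
    return match.group(1).strip(), match.group(2)


def uncited_references(references, body_text):
    """List the references with no matching 'Surname ... year' in the text."""
    uncited = []
    for ref in references:
        key = reference_key(ref)
        if key is None:
            uncited.append(ref)  # unparsable entries are flagged for manual review
            continue
        surname, year = key
        # Allow short filler such as ' et al., ' or ' (' between name and year.
        pattern = re.escape(surname) + r"[^()\d]{0,40}\(?" + year
        if not re.search(pattern, body_text):
            uncited.append(ref)
    return uncited


refs = [
    "Sheldrick, G. M. (2008). Acta Cryst. A64, 112-122.",
    "Spek, A. L. (2009). Acta Cryst. D65, 148-155.",
]
text = "The structure was refined with SHELXL (Sheldrick, 2008)."
print(uncited_references(refs, text))
```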

Paper creation tools

A number of 'wizards' are available for entering and editing data. Most can be activated by double-clicking the Preprint window in the appropriate place (e.g. double-clicking the author section will show forms for entering the author details, double-clicking a table will activate a table editor, etc.).

Paper creation wizard

The Paper creation wizard extracts information from the CIF and then presents the user with a series of forms, incorporating these data where available (e.g. if _chemical_name_common is found and there is no _publ_section_title, the chemical name will be offered as a suggested title for the paper). In this way, the wizard collects title and author details for inclusion in the CIF, and (optionally) enough information to write a template Abstract.
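The title suggestion given as an example above amounts to a simple fallback rule, sketched here in Python. The dictionary-of-items representation and the function name are assumptions made for illustration.

```python
def suggest_title(cif_items):
    """Suggest the common chemical name as a title if no title is present."""

    def present(name):
        # '?' and '.' are the CIF placeholders for unknown and inapplicable.
        value = cif_items.get(name, "?").strip()
        return value not in ("", "?", ".")

    if present("_publ_section_title"):
        return None  # a title already exists, so nothing is suggested
    if present("_chemical_name_common"):
        return cif_items["_chemical_name_common"].strip()
    return None


print(suggest_title({"_chemical_name_common": "Caffeine",
                     "_publ_section_title": "?"}))
```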

Edit authors

The authors can be edited using interactive forms. An added advantage of this approach is that the forms provide lists of names and addresses that have been used in previous CIFs. publCIF stores such details in a local text file (/xpublcif/user/myauthors.txt). This database can be edited using the Save authors tool (see below).

Save authors

The Paper creation wizard stores new author details for future use in a simple text file. Save authors provides access to this local database of names and addresses, allowing you to remove records, etc., and to save the current authors if the wizard hasn't already done so.

Data input wizard

The Data input wizard can be activated by double-clicking on a data value in the experimental tables displayed in the Preprint window. It provides a simple input box with a list of allowed values appropriate for the particular item, or a list of values entered previously using this wizard (stored in local text files), and checks the input against the dictionary, etc.

Table wizard

The standard geometry and hydrogen-bond tables generated from the CIF geometry data loops can be edited using a spreadsheet-type form by double-clicking on the table in the Preprint window, or using the right-mouse menu in the CIF window.

It is also possible to create extra tables in the CIF. The Extra table wizard provides a spreadsheet-type form with functions to copy, paste, remove and add cells, columns and rows, as well as import data copied from the standard table editors or from external sources (e.g. it is possible to import a table from most word processors).

Note: the _extra_table data names are not in the core dictionary, but are data items used locally by IUCr journals (other organizations may not recognize them).

Standard references

A comprehensive list of standard references is available for searching and pasting into the CIF. These have been checked for accuracy and completeness and are in the format required for IUCr journals. The Standard references form provides a simple search tool to find a particular reference (e.g. searching for 'SHELX' will display a list of references for the SHELX family of software). In addition, a list of standard journal abbreviations is provided.

Troubleshooting

A version of publCIF has been in use at the IUCr's Editorial Offices since the beginning of 2005. This version is intended for use with CIFs describing small-molecule structures, to be published in IUCr journals. It will prepare a Preprint accordingly, regardless of the number of structures (data blocks) contained in the CIF. Bearing in mind that it may take of the order of 5 s per structure to generate the Preprint, large CIFs may take some time to open.

The following recommendations and known issues should be noted:

Please send any comments and report any bugs to support@iucr.org.

Specification

publCIF was designed for and on behalf of the IUCr by S. P. Westrip. This software was written using Qt Open Source Edition version 4.0.1 from Trolltech (www.trolltech.com). It is distributed under the terms of the GNU General Public License version 2 as published by the Free Software Foundation and appearing in the file LICENSE.GPL included in the packaging of this software.

This software is provided 'as is' with no warranty of any kind, including the warranty of design, merchantability and fitness for a particular purpose.