Minutes of 2021 COMCIFS meeting

Committee for the Maintenance of the CIF Standard (COMCIFS)

Meeting Online 31 August 2021 12:00 UTC

Present: James Hester* (Chair), Herbert Bernstein*, John Bollinger*,
Saulius Grazulis, Mike Hoyland, James Kaduk, Brian McMahon*, Andrius
Merkys, Peter Murray-Rust, Brian Toby, Antanas Vaitkus, John
Westbrook*, Simon Westrip

* Voting members

1. COMCIFS report last triennium
--------------------------------

The triennial report for 2017-2021 (a four-year period on account of
the postponement of the IUCr XXV Congress) was circulated beforehand
and will be published as part of the Congress Report in Acta
Crystallographica Section A.

2. Any questions/discussion of report
-------------------------------------

There were no comments. Among the items noted were that submissions to
the wwPDB in CIF format were now mandatory, that work on the second
edition of International Tables Volume G was progressing, that
collaboration between COMCIFS and the NeXus International Advisory
Committee (NIAC) continued to be fruitful, and that there was contact
with European efforts around materials modelling (EMMO and OPTIMADE).

3. Actions arising from CommDat meeting 25th August
---------------------------------------------------

COMCIFS works closely with the IUCr Committee on Data (CommDat),
and noted the following activities discussed at the CommDat
post-Congress meeting.

The high-pressure community have been working for some time on a CIF
dictionary, and are keen to make progress with this.

The quantum crystallography community also sees a need to capture
details of non-spherical atomic form factors in a CIF format (as
evidenced by Dylan Jayatilaka's keynote talk at the Congress and a
presentation by Simon Grabowsky and Krzysztof Wozniak at the
pre-Congress CommDat workshop on chemical crystallography).

The Commission on Powder Diffraction held an online meeting after the
Congress to discuss further developments of the pdCIF dictionary and
applications. Their priority is to facilitate validation of powder
structure determinations submitted to IUCr journals, but they also see
a need to capture all details of serial refinements across ranges of
temperature and pressure, and the desirability of working with
instrument vendors to improve the capture of raw data direct from the
diffractometer to CIF.

The Commission on Electron Crystallography would also like to see CIF
dictionary extensions to capture special aspects of electron
diffraction. Many members of the Commission also work with nanoED, a
European academic consortium that has agreed placements at the Chester
office for two students or postdocs to learn about the formal
aspects of CIF.

It was also noted that links to CIF dictionaries, templates and
example files on the IUCr web site used ftp-based URIs. These were no
longer fetched by current web browsers. Brian McMahon (BM) undertook
to replace these by http(s)-based URLs, and would review the best way
to provide location information in the CIF dictionary register.

4. Revitalising COMCIFS
-----------------------

4.1 Discussion of ideas
-----------------------

While acknowledging that the mmCIF/PDBx family of dictionaries was
keeping pace effectively with developments in biological
macromolecular structural science, James Hester (JH) reflected that
other areas under COMCIFS supervision had plateaued for much of the
past decade. He felt that there was a need for COMCIFS to be, and to
be seen to be, more productive in extending CIF standards to novel
techniques and scientific growth areas. In addition to the initiatives
discussed by CommDat, he gave neutron diffraction and the SQUEEZE
algorithm as examples of areas where new definitions and procedures were
required. He did note that dictionaries were now being developed on
GitHub, permitting access to a larger group of people. His early
suggestion (on the COMCIFS email list) that it might be useful to have
a technical advisory committee separate from the dictionary content
developers had not found favour - it was generally felt that technical
implementation and domain knowledge should be intimately coupled. He
did identify the problem that the machinery of COMCIFS was opaque to
the outside world, and that some mechanism was needed to make it as
easy as possible for people to engage with COMCIFS activities when new
standards were desired. He invited further input on ways to revitalise
the work of the Committee.

Herbert Bernstein (HJB) argued that the most important step was to
provide clear step-by-step instructions (on the IUCr web or in print)
on how to go about creating a new dictionary. He also suggested
considering the ANSI/ISO model where standards creators are required to
review the status of the standard on a fixed cycle; but emphasised
that the engagement of stakeholders was more important than the
specific process adopted.

Peter Murray-Rust (PMR) considered the CIF dictionary effort
outstanding, and used it as a model in his current molecular and plant
science activities. He felt it was important to identify "hot spots"
where people were actively developing ontologies, and saw materials
science as an area where there was much current activity. He also
recommended getting CIF identifiers into WikiData, and has already
worked with Saulius Grazulis (SG) and Antanas Vaitkus (AV) on putting
COD identifiers there. This would put crystallography in front of many
more people through the distributed nature of WikiData. JH mentioned
the European Materials Modelling Council (EMMC) as a materials
science initiative that he had engaged with, and PMR referred to
the BIG-MAP (Battery Interface Genome – Materials Acceleration Platform)
and Material Genome projects, while pointing out that such initiatives
tended to flourish during a period of supported funding but had
limited longevity.

SG reported that he had had very positive experiences with GitHub and
similar collaborative platforms in developing standards (OPTIMADE) and
in software development, and that the use of a familiar community
platform could allow IUCr to attract other groups with an interest in
developing standards to work within a common development environment.
[JH demonstrated some aspects of the existing COMCIFS GitHub
implementation, which allowed this sort of discussion and peer
review for the current suite of CIF dictionaries and their conversion
to DDLm.] If COMCIFS were seen to be inclusive towards new ideas,
that would encourage other groups to look to COMCIFS for authority,
consistency and expert guidance. JH agreed on the merits of GitHub,
but saw a need for documentation to help new users to use it to best
advantage. He also made the point that as GitHub used version control,
it was possible to allow experimentation by anyone interested in
contributing, thus creating a more inclusive and welcoming environment.

JH reviewed some ideas he had presented to the IUCr Executive Committee:

* Topic-focused virtual workshops
* IUCr contribution to conference/sabbatical for completion of CIF work
* Lightweight newsletter for mailing lists (e.g. quarterly, with details
   of recent GitHub activity)
* Training modules for dictionary authors
* Formalised and published governance procedures
* Commissions take responsibility for dictionaries

He now thought that simple guidelines and how-to's might usefully take
the place of unduly formalised governance procedures.

HJB cautioned against a system that relied solely on Commissions,
because of the danger of activities behind closed doors that could
stifle wider community input. JH argued that the Commission was at
least answerable to the Executive Committee. But there was general
agreement that any development activity should be carried out on a
platform such as GitHub, where openness was a feature and any
interested contributors from the wider community could provide
input. It was felt that COMCIFS would liaise with a sponsoring
Commission and would provide guidance on setting up a GitHub site
linked within the framework of COMCIFS activities.

BM pointed out that there was a historic diversity of dictionary-building
projects - some Commissions (Aperiodic Crystals, Magnetic Structures)
had proven very effective; other dictionaries (electron density,
topology) had arisen from focussed groups of individual
programmers. This suggested some ongoing flexibility in the
composition of dictionary working groups; but Commission liaison was
worthwhile where that seemed useful or appropriate; and the idea of a
common family of GitHub sites would give a more complete and
coherent picture of overall activities. He also suggested (1) that
journal editors should be brought into working groups to design a
new table of mandatory experimental data items that would inform the
publication requirements to be met by a new dictionary; and (2) that some
form of roadmap be published on the CIF website to show what areas of
structural science had existing CIF dictionaries, where these were
under revision or construction, and what areas were barren and had
potential for future ontology development.

John Westbrook (JW) agreed that publishers (and relevant repositories)
were key players who should be involved in new developments. Another
aspect of diversity was the variety of stakeholders who might have
different requirements as a new area of CIF was developed, and so it
was essential that any working groups sought to engage with and
represent all of these diverse (and sometimes conflicting)
requirements. He made the point that (from a repository viewpoint)
collecting the data was dependent on the software actually in use
within the community that was going to support the desired standards.

JW also emphasised that the experience of the mmCIF community was
that greatest productivity flowed from small or medium-sized working
groups able to work together on a regular basis. Asynchronous
(email-based) discussion was not very productive where complex
requirements needed to be understood and agreed. A model where people
can be brought together, focus on the immediate requirements, and
revisit progress on a timescale where enough time has been allowed to
develop code, but not so much that details get forgotten, has worked
effectively.

SG thought the idea of a lightweight newsletter for developers a good
one, but pointed out the potential usefulness of the IUCr Newsletter
for periodically carrying news on data standardisation to a much wider
community. He also pointed out that email-led development could work
well, provided the discussion was led and directed by a skilled project
leader. Returning to the problems faced by groups new to CIF of
getting up to speed on how to create a dictionary, he recommended that
COMCIFS publish contact details of members who could act as mentors to
people coming freshly to the field.

Jim Kaduk (JK) reviewed the status of the powder dictionary. While the
immediate priority was to encourage more authors to submit structure
determinations based on powder to IUCr journals (through fine-tuning
of checkCIF procedures), he felt the pdCIF dictionary needed to be
reviewed so as better to handle CIFs with mixtures and/or quantum
mechanical calculations. It was also the case that parametric
experiments were much more important than when the pdCIF dictionary
was first published, and new experimental metadata were needed to
describe current practice. He volunteered to explore these issues. JH
suggested this could be the basis of a focussed workshop. JK indicated
that authors of the most widespread packages (GSAS, TOPAS, FullProf)
would be among the stakeholders that one needed to engage.

4.2 Next steps
--------------

JH would consider the discussion and circulate to the group a
summary of suggestions on moving forward.

BM would update ftp-based links on the IUCr website.

BM would think about a suitable visual indication of established
CIF ontologies, ones under development, and areas requiring new
standards.

5. Any other business
---------------------

None was raised.

The meeting concluded at 13:00 UTC.


Brian McMahon
Secretary

