Crystallographic data

Workshop on MX raw image data formats, metadata and validation

Report

Saturday August 14 2021

Prague, Czech Republic

"Macromolecular crystallography (MX) is the dominant means of determining the three-dimensional structures of biological macromolecules. Over the last few decades, most MX data have been collected at synchrotron beamlines using a large number of different detectors produced by various manufacturers and taking advantage of various protocols and goniometry. These data came in their own formats, sometimes proprietary, sometimes open. The associated metadata rarely reached the degree of completeness required for data management according to Findability, Accessibility, Interoperability and Reusability (FAIR) principles. Efforts to reuse old data by other investigators or even by the original investigators some time later were often frustrated. In the culmination of an effort dating back more than two decades, a large portion of the research community concerned with High Data-Rate Macromolecular Crystallography (HDRMX) has now agreed to an updated specification of data and metadata for diffraction images produced at synchrotron light sources and X-ray free electron lasers (XFELs). This Gold Standard will facilitate processing of datasets independent of the facility at which they were collected and enable data archiving according to FAIR principles, with a particular focus on interoperability and reusability. This agreed standard builds on the NeXus/HDF5 NXmx application definition and the International Union of Crystallography (IUCr) imgCIF/CBF dictionary and is compatible with major data processing programs and pipelines.  Just as with the IUCr CBF/imgCIF standard from which it arose and to which it is tied, the NeXus/HDF5 NXmx Gold Standard application definition is intended to be applicable to all detectors used for crystallography, and all hardware and software developers in the field are encouraged to adopt and contribute to the standard." [A. Förster, H. J. Bernstein, A. Bhoemick, A. S. Brewster, S. Brockhauser, L. Geliso, D. R. Hall, F. Leonarski, V. Mariani, G. 
Santoni, C. Vonrhein, G. Winter (2020), "Gold Standard for Macromolecular Diffraction Data", IUCrJ, 7, 784-792]

This is a tutorial workshop presented by the developers of the Gold Standard. It introduces the community to this important upgrade to the interoperability and reusability of macromolecular crystallographic data for both synchrotrons and XFELs, and gives participants an opportunity to work with and comment on the Gold Standard. To share best practices for metadata recording widely, we encourage the participation of neutron and chemical crystallographers.

The programme of IUCr XXV in Prague, CZ included a Workshop on MX raw image data formats, metadata and validation on this date. It had been planned as a hybrid meeting, but circumstances led us to convert it to a purely virtual e-meeting, held via Zoom on Saturday, 14 August 2021, from 9 am until 3 pm, Prague time.

Reminder: The CommDat user forum is at https://forums.iucr.org

The source page for this workshop is maintained by Herbert J. Bernstein at http://www.medsbio.org/meetings/HDRMX_14Aug21.html 

Programme

Times are given in Central European Summer Time (CEST), British Summer Time (BST) and Eastern Daylight Time (EDT) for the benefit of participants in Europe, the UK and the US East Coast.

Saturday 14 August 2021

09:00-09:15 (08:00-08:15, 03:00-03:15) Welcome, introductions, Zoom setup and connection tests
09:15-09:45 (08:15-08:45, 03:15-03:45) Aaron Brewster (LBL) Using the Gold Standard for data archival at kilohertz speeds Presentation (26 MB) | Audio recording (37 MB)

09:45-10:15 (08:45-09:15, 03:45-04:15) Herbert J. Bernstein (Ronin Institute) MX raw data formats and the Gold Standard Presentation (9.9 MB) | Audio recording (36 MB)

10:15-10:45 (09:15-09:45, 04:15-04:45) Max Burian and Diego Gaemperle (Dectris) Stream2 and FileWriter2 Presentation (0.5 MB) | Audio recording (31 MB)

10:45-11:15 (09:45-10:15, 04:45-05:15) Coffee
11:15-11:45 (10:15-10:45, 05:15-05:45) Filip Leonarski (PSI) Jungfraujoch: A Data Acquisition and On-the-fly Analysis System for High Data-Rate Macromolecular Crystallography Presentation (15 MB) | Audio recording (34 MB)

11:45-12:15 (10:45-11:15, 05:45-06:15) Natalie Johnson (CCDC) Synchrotron Data in the CSD Presentation (1.5 MB) | Audio recording (41 MB)

12:15-12:30 (11:15-11:30, 06:15-06:30) Daniel Eriksson (Australian Synchrotron) Facility Report Australian Synchrotron Presentation (1.1 MB) | Audio recording (10 MB)

12:30-12:45 (11:30-11:45, 06:30-06:45) Herbert J. Bernstein for Dale Kreitler (NSLS-II) Facility Report NSLS-II Presentation (n.n MB) | Audio recording (19 MB)

12:45-13:15 (11:45-12:15, 06:45-07:15) Lunch/breakfast break
13:15-14:30 (12:15-13:30, 07:15-08:30) Facility Reports
Oskar Aurelius (MAX IV) MAX IV facility update Video recording (39 MB)

Open Discussion: Future Directions
What metadata are lacking at the moment?
What problems are we envisioning?
Audio recording (67 MB)