
Interest in data flourishes at AsCA

Amy Sarjeant
Speakers and Chairs (left to right): Amy Sarjeant (Co-Chair), Stephen Burley, James Hester, Matthew Lightfoot, Janet Newman, Brian McMahon, Takeshi Kawabata and Genji Kurisu (Co-Chair).
Late last year I had the pleasure of traveling to Auckland, New Zealand, to attend the 2018 meeting of the Asian Crystallographic Association (AsCA). I’m sure this won’t be the only update about this excellent conference, but I wanted to take this opportunity to write a short summary of the session I co-chaired with Genji Kurisu of PDBj. Early in 2018, we were asked to chair a session entitled “Database developments, validation and data mining.” This was the first time a session devoted solely to databases was included as part of the program for the AsCA meeting series. As our community continues to tackle the issues surrounding data sharing under the FAIR principles (Findable, Accessible, Interoperable and Reusable), sessions like this will play an important role in outlining our response to these challenges. For our session, we were fortunate to be able to include a variety of talks from a truly international set of speakers. Topics ranged from understanding the nature of a dataset itself, to data validation and everything in between.

James Hester of ANSTO kicked off the session by posing the question “What is a dataset?” As crystallographers we use the word “dataset” constantly, but often without pondering its meaning. James walked us through an approach to mapping raw image data files to imgCIF format, thereby providing a protocol for systematic archiving and sharing of raw data. From there, Janet Newman of the CSIRO Collaborative Crystallisation Centre (C3) took us on a tour of C6 (Comparison of Crystallization Conditions @ C3). She highlighted some of the challenges in standardizing crystallization screen data and some of the tools her team is making available to analyse crystallization results.

The next two presentations provided updates and insights into the status of the Cambridge Structural Database and the PDB. Matthew Lightfoot of the CCDC detailed some of the work his team has been doing in understanding the quality of datasets deposited with CCDC and improving data validation through the deposition workflow. Stephen Burley of the RCSB PDB detailed the efforts of his team to identify and correct ligand refinements in PDB structures. Toward the end of his presentation, Stephen highlighted the impact of the PDB on drug approvals by the US Food and Drug Administration. Continuing with updates from the PDB, Takeshi Kawabata of the Institute for Protein Research, Osaka University, provided a look at what PDBj is doing with data from electron microscopy (EM) studies. Takeshi detailed the EMPIAR archive of raw 2D EM images as well as EM Navigator, which provides a user-friendly interface to the EMDB server.

Finally, the session concluded with a contribution from Brian McMahon of the IUCr entitled “The element of trust: validating and valuing crystallographic data.” Brian’s presentation touched on the utility of the CIF file not only as a means of sharing data but also as an enabler for data checking and validation. This talk underscored the importance of focusing on data-sharing practices and protocols, which stood at the heart of every presentation. Without standardization, we lack the ability to validate the data we are using. And without validation, what can we truly say about the scientific conclusions we draw from these collections of data?

Sessions on data management and archiving provide a forum for concerned researchers to share experiences and agree on standard practices. As members of the structural science community, we have a responsibility to provide the highest quality data achievable from our experimental studies. I would encourage everyone to look through the talks described above on the IUCr website to learn more about ongoing efforts throughout the crystallographic community.

10 April 2019