Contamination or serendipity – doing the wrong thing by chance

Janet Newman and Mark J. van Raaij

To contaminate – definition 2 in the Merriam-Webster (online) dictionary – is to ‘make unfit for use by the introduction of unwholesome or undesirable elements’, and a contaminant is then something that contaminates. As was pointed out in 2016 (Niedzialkowska et al., 2016), the process of X-ray crystallography is rich with opportunities for contamination, and this can result in the production of an unexpected structure. A ‘contaminant’ in structural biology is generally ‘undesirable’ rather than ‘unwholesome’, as very careful and thorough analysis is required to deconvolute the unexpected results when a contaminant progresses through the entire process of expression, purification, crystallization and structure determination. Two such examples are presented in this issue – a structure of the E. coli terminator protein Rho (Fan & Rees, 2020), and the structure of a cyanate hydratase from a Serratia strain of bacteria (Pederzoli et al., 2020). In both papers, careful crystallography enabled the identification of the protein that had actually crystallized, in particular through the use of rotation functions to identify unexpected non-crystallographic symmetry.

Given the development of tools to help identify contaminating proteins – ContaBase and the associated ContaMiner (Hungler et al., 2016), and the more recent SIMBAD tool (Simpkin et al., 2018) – why are contaminating proteins still an issue? One reason may be that the identity of purified proteins is not always checked by mass spectrometry or N-terminal sequencing, and a band of the right size on a denaturing polyacrylamide gel is often taken to be sufficient evidence.

Another reason may be that novel contaminants are simply not presented as such; the authors embrace the serendipity and publish as if the structure were the intended goal – one of us has certainly been guilty of such an approach (Newman et al., 2013). More interesting is the idea that every expression of a heterologous protein stresses the host in different ways, which can lead to the upregulation of different proteins, and thus a different protein background, even when using the same, well-utilized expression system. So protein structures that are apparently unfortunate errors might provide insight into the mysteries of heterologous expression.

Finally, the formation of crystals from a mixture in which the contaminant made up only a tiny fraction of the protein present may be surprising to many, and is something to keep an eye out for (Fan & Rees, 2020). The possibility of Serratia contaminating an E. coli expression strain (Pederzoli et al., 2020) was, for us at least, also an eye-opener – definitely something for all protein producers to be aware of and to try to prevent.


Fan, C. & Rees, D. C. (2020). Acta Cryst. F76, 436–443.

Hungler, A., Momin, A., Diederichs, K. & Arold, S. T. (2016). J. Appl. Cryst. 49, 2252–2258.

Newman, J., Seabrook, S., Surjadi, R., Williams, C. C., Lucent, D., Wilding, M., Scott, C. & Peat, T. S. (2013). PLoS One, 8, e58298.

Niedzialkowska, E., Gasiorowska, O., Handing, K. B., Majorek, K. A., Porebski, P. J., Shabalin, I. G., Zasadzinska, E., Cymborowski, M. & Minor, W. (2016). Protein Sci. 25, 720–733.

Pederzoli, R., Tarantino, D., Gourlay, L. J., Chaves-Sanjuan, A. & Bolognesi, M. (2020). Acta Cryst. F76, 392–397.

Simpkin, A. J., Simkovic, F., Thomas, J. M. H., Savko, M., Lebedev, A., Uski, V., Ballard, C., Wojdyr, M., Wu, R., Sanishvili, R., Xu, Y., Lisa, M.-N., Buschiazzo, A., Shepard, W., Rigden, D. J. & Keegan, R. M. (2018). Acta Cryst. D74, 595–605.


This article was originally published in Acta Cryst. (2020). F76, 391.


Find a compilation of publications on contaminant protein structures and resources in IUCr journals here.

26 September 2020
