A next-generation computing platform for electron microscopy at Imperial College

Lead Research Organisation: Imperial College London
Department Name: Life Sciences


In a cell, proteins are like tiny molecular machines carrying out complex biological functions. Many proteins work together in large assemblies that are difficult to study because they are dynamic. To see protein complexes in action, we use a method called cryo-electron microscopy, in which we take hundreds of thousands of protein snapshots and combine these images computationally to create a three-dimensional model. Using these models, we can understand how proteins interact with binding partners to do their jobs inside cells.

Computing is an important part of generating structural models from electron microscopy pictures. Previously, researchers used individual computers at their desks to process data. This is no longer possible because of the large number of images used in these calculations and the complexity of the computer programs involved. The computer gaming industry has driven advances in graphical computing technology, which we use to process images of proteins. This proposal is for a computing platform that combines a high-performance file storage system with graphical computing resources to process cryo-electron microscopy data at Imperial College London. This computing platform will allow us to solve more structures more quickly and enable researchers to use these data to inform new experiments sooner.

This proposal will support structural biology research at Imperial that works toward understanding the rules of life. Many of the projects that will directly benefit from this computing equipment are tackling BBSRC strategic challenges in sustainable agriculture, an integrated understanding of health, and data-driven biology. Half of the co-investigators on this proposal are researching the fundamental biological processes underlying antimicrobial resistance mechanisms. Together, the new structural models generated using this computing platform will lay the foundation for the discovery of therapeutics to improve human health.

Technical Summary

Cryo-electron microscopy (cryoEM) is a powerful structural biology technique that provides mechanistic insight into biological processes, including cell biology, microbiology, and infection. Achieving high-resolution structures of dynamic protein complexes requires large numbers of images and computationally intensive classification algorithms. Machine learning tools are now central to modern cryoEM image processing, model building and data analysis. While these tools have revolutionized what is possible using cryoEM, capitalizing fully on them requires high-powered GPU computing nodes and large-scale file storage systems. Imperial College has access to its own Thermo Fisher Scientific Glacios microscope, and to Titan Krios instruments at eBIC and LonCEM. These sources are generating a petabyte of data per year. This torrent of data is testing the limits of the computing capacity of individual groups and restricting the entry of new groups into cryoEM.

We propose a next-generation computing platform for electron microscopy at Imperial College. The platform would be compatible with Imperial's centrally managed infrastructure, giving us high-level integration with ultra-high bandwidth and connectivity. This proposal will support the purchase of 20 SR650v2 servers, each equipped with four A10 GPU cards. The proposed two petabytes of storage will be provided by the Lenovo DSS-G storage system, which uses the IBM Spectrum Scale parallel filesystem. These computing resources will enable us to process data with state-of-the-art software based on Bayesian and deep learning algorithms.

The platform will be supported for five years and will enable us to take full advantage of the electron microscopes available to us. The platform will immediately and positively impact more than £14m of active BBSRC-funded research and training activity and will ensure that Imperial College remains world-class in cryoEM for years to come.

