A next-generation computing platform for electron microscopy at Imperial College

Lead Research Organisation: Imperial College London
Department Name: Life Sciences

Abstract

In a cell, proteins are like tiny molecular machines carrying out complex biological functions. Many proteins work together in large assemblies that are difficult to study because they are dynamic. To see protein complexes in action, we use a method called cryo electron microscopy in which we take hundreds of thousands of protein snapshots, and with a computer, combine these images to create a three-dimensional model. Using these models, we can understand how proteins interact with binding partners to do their jobs inside cells.

Computing is an important part of generating structural models from electron microscopy pictures. Previously, researchers used individual computers at their desks to process data. This is no longer possible because of the large number of images used in these calculations and the complexity of the computer programmes involved. The computer gaming industry has driven technology advances in graphical computing, which we use to process images of proteins. This proposal is for a computing platform that combines a high-performance file storage system with graphical computing resources to process cryo electron microscopy data at Imperial College London. This computing platform will allow us to solve more structures more quickly and enable researchers to use these data to inform new experiments faster.

This proposal will support structural biology research at Imperial that works toward understanding the rules of life. Many of the projects that directly benefit from this computing equipment are tackling BBSRC strategic challenges in sustainable agriculture, an integrated understanding of health, and data-driven biology. Half of the CoIs on this proposal are researching the fundamental biological processes underlying antimicrobial resistance mechanisms. Together, the new structural models generated using this computing platform will lay the foundation for the discovery of therapeutics to improve human health.

Technical Summary

Cryo electron microscopy (cryoEM) is a powerful structural biology technique that provides mechanistic insight into biological processes, including cell biology, microbiology, and infection. Achieving high-resolution structures of dynamic protein complexes requires large numbers of images and computationally intensive classification algorithms. Machine learning tools are now central to modern cryoEM image processing, model building and data analysis. While these tools have revolutionised what is possible using cryoEM, they require high-powered GPU computing nodes and large-scale file storage systems to fully capitalise on recent advances in the field. Imperial College has access to its own Thermo Fisher Scientific Glacios microscope, and to Titan Krios instruments at eBIC and LonCEM. These sources are generating a petabyte of data per year. This torrent of data is testing the limits of the computing capacities of individual groups and restricting the entry of new groups into cryoEM.

We propose a next-generation computing platform for electron microscopy at Imperial College. The platform would be compatible with Imperial's centrally managed infrastructure, giving us high-level integration with ultra-high bandwidth and connectivity. This proposal will support the purchase of 20 SR650v2 servers, each equipped with four A10 GPU cards. The proposed two petabytes of storage will use the Lenovo DSS-G storage system with the IBM Spectrum Scale parallel filesystem. The computing resources will enable us to process data with state-of-the-art codes using Bayesian and deep learning algorithms.
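
The headline figures above imply some simple capacity arithmetic, sketched below in Python. The server count, GPUs per server, storage size, and the data rate of one petabyte per year are taken from this summary; the function names and the `retained_fraction` parameter are hypothetical, and the assumption that the data rate stays constant and that all data land on this storage is ours, not a statement from the proposal.

```python
# Rough capacity arithmetic for the proposed platform.
# Figures come from the Technical Summary; everything else is an
# illustrative assumption (constant data rate, no other storage tiers).

SERVERS = 20                  # SR650v2 servers in the proposal
GPUS_PER_SERVER = 4           # A10 GPU cards per server
STORAGE_PB = 2.0              # proposed storage, in petabytes
DATA_RATE_PB_PER_YEAR = 1.0   # stated data generation rate


def total_gpus(servers: int = SERVERS, gpus_per_server: int = GPUS_PER_SERVER) -> int:
    """Total number of GPU cards across all servers."""
    return servers * gpus_per_server


def years_until_full(
    storage_pb: float = STORAGE_PB,
    rate_pb_per_year: float = DATA_RATE_PB_PER_YEAR,
    retained_fraction: float = 1.0,
) -> float:
    """Years before storage fills, if only `retained_fraction` of new data is kept.

    `retained_fraction` is a hypothetical knob: 1.0 means every byte is kept,
    0.5 means half of the incoming data is deleted or archived elsewhere.
    """
    if rate_pb_per_year <= 0 or retained_fraction <= 0:
        raise ValueError("rate and retained_fraction must be positive")
    return storage_pb / (rate_pb_per_year * retained_fraction)


if __name__ == "__main__":
    print(f"GPU cards in total: {total_gpus()}")
    print(f"Years until full (keep everything): {years_until_full():.1f}")
    print(f"Years until full (keep half): {years_until_full(retained_fraction=0.5):.1f}")
```

Under these assumptions, the storage would fill in about two years if nothing were deleted, which suggests that data retention policy matters over the five-year support period.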

The platform will be supported for 5 years and will enable us to take full advantage of the electron microscopes available. The platform will immediately and positively impact >£14m of active BBSRC-funded research and training activity and will ensure that Imperial College is world-class in cryoEM for years to come.

Description: One year has passed since installation of the computing platform. We have established a steering committee that meets quarterly, and a systems administrator manages the platform. The SLURM resource manager is integrated with a variety of software for single-particle and tomography applications. The platform has 18 registered groups (PIs) and 80 individual user accounts from across Imperial College. We have established a variety of training scripts for our user community, made available through a central website. We have two key research outputs from 2024, which include the bioRxiv preprints https://doi.org/10.1101/2024.08.06.606606 and https://doi.org/10.1101/2024.12.30.630807, both in revision at peer-reviewed journals. Associated with these manuscripts are four PDB depositions (9I1R, 9EYS, 9I9L, 9HVC), one EMDB deposition (EMD-52431), and one EMPIAR dataset (EMPIAR-12559), which are on hold for publication at the time of this reporting period. We also have one follow-on funding award: BBSRC grant BB/Z516740/1.
Exploitation Route: It is too early to understand how the outcomes have been taken forward. They have only just been made public on pre-print servers.
Sectors: Agriculture; Food and Drink; Healthcare; Pharmaceuticals and Medical Biotechnology