Flexible-body refinement for Cryogenic Electron Microscopy Applications

Lead Research Organisation: University of York
Department Name: Chemistry

Abstract

Scientists are interested in the atomic structure of biological molecules, in other words what the molecules look like. Knowing in detail what a molecule looks like provides important clues to how it might work. If we can go further and capture molecules in the process of interacting with other biological molecules, or artificial compounds such as drugs, we get a clearer picture of how they work.

Most of our knowledge of the structure of biological molecules comes from X-ray crystallography. However, over the past decade a new technique, electron microscopy (EM), has become popular. Individual molecules held in a thin film of liquid solvent are frozen and placed in an electron microscope, which captures images of the molecules. Many individual views can be combined to construct a model of the structure of the molecule in three dimensions. Electron microscopy has developed rapidly over the past decades due to new kinds of electron detector and new software methods, leading to a 'resolution revolution' enabling a much greater understanding of the molecules.

In the most common cases, images of molecules are 'fuzzy' enough that we can't see individual atoms. The EM user therefore needs to have some knowledge of the structure of the molecule, or at least parts of it, in advance. This prior knowledge may come from other techniques, such as X-ray crystallography or computational modelling. The prior models can then be fitted into the EM image to give an indication of the whole structure, an approach that has allowed large molecular machines such as the ribosome to be understood.

The prior model is generally only a poor match for the true structure, either because it came from a different species, or because it was distorted by crystallisation, or because of limitations in the computational modelling process. The model must therefore be adjusted in order to fit into the observed EM images. This is done both with automated software such as Flex-EM, which breaks the structure into successively smaller fragments and adjusts their positions to fit the density, and with time-consuming manual modelling using 3D graphics.

The aim of this project is to take an existing method called 'shift field refinement' for distorting one 3D image to better fit another, and apply it to several problems in the determination of molecular structures from EM images. The method was developed by Professor Cowtan for problems in X-ray crystallography, but is sufficiently general to apply to other problems. The first problem we will address is fitting a known molecular structure into a 3D EM image. Rather than breaking the model up into smaller fragments which are each fitted separately, shift field refinement can very rapidly determine smooth deformations of the model which improve its fit to the image.

We will also look at the problem of improving EM images of flexible molecules. In this case, the 3D EM image is blurred because it is combined from 2D images of thousands of particles, with each particle being slightly different. We will improve the 3D particle image by averaging together smaller clusters of more similar particles, and then using shift field refinement to adjust for the differences between the clusters before averaging them together to produce final 3D images.

The project involves adding new steps to existing computer software for these problems and implementing new methods in a way which can be easily integrated with the existing software. Rather than developing a suite of software from scratch, we are working with existing software tools, including Flex-EM. This reduces the cost of the project, improves its chances of success, and exploits the best features of the new and existing methods.

All of the software produced by the project will be distributed freely to academic users through existing software suites for electron microscopy. The source code for software will also be distributed so that other developers can learn from it or modify it.

Technical Summary

The aim of the project is to address three timely problems in the rapidly growing field of Cryo-EM data processing using a new family of computational techniques developed at York. 'Shift field refinement' is a set of methods for optimizing the agreement between a pair of 3D (or 2D) images by applying a smooth deformation to one of the images, where the smoothness of the deformation can be adjusted to reflect the level of information in, or the resolution of, the images.
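The idea of a smooth deformation with adjustable smoothness can be illustrated with a minimal one-dimensional sketch. It is not the project's implementation: it assumes a first-order (linearised) model in which the target image equals the source image displaced by a small local shift, and it estimates that shift by a locally weighted least-squares fit of the difference between the images against the gradient of the source. The function names, the Gaussian weighting and the `smooth_sigma` parameter are illustrative assumptions.

```python
import numpy as np


def gaussian_kernel(sigma, truncate=4.0):
    """Normalised 1D Gaussian weights; sigma sets the smoothness of the shift field."""
    half = int(truncate * sigma + 0.5)
    offsets = np.arange(-half, half + 1)
    weights = np.exp(-0.5 * (offsets / sigma) ** 2)
    return weights / weights.sum()


def estimate_shift_field_1d(source, target, smooth_sigma, eps=1e-12):
    """Estimate s(x) such that target(x) ~ source(x - s(x)), to first order.

    Linearising gives target - source ~ -s * d(source)/dx. The shift at each
    point is the least-squares solution of that relation, with nearby points
    weighted by a Gaussian of width smooth_sigma (in grid points).
    """
    source = np.asarray(source, dtype=float)
    target = np.asarray(target, dtype=float)
    gradient = np.gradient(source)
    difference = target - source
    kernel = gaussian_kernel(smooth_sigma)
    numerator = np.convolve(difference * gradient, kernel, mode="same")
    denominator = np.convolve(gradient * gradient, kernel, mode="same")
    # eps keeps the estimate near zero where the source carries no signal.
    return -numerator / (denominator + eps)


# Illustrative data: a Gaussian peak and a copy displaced by half a grid point.
x = np.arange(200, dtype=float)
source = np.exp(-0.5 * ((x - 100.0) / 10.0) ** 2)
target = np.exp(-0.5 * ((x - 100.5) / 10.0) ** 2)
shift = estimate_shift_field_1d(source, target, smooth_sigma=8.0)
print(round(float(shift[100]), 2))
```

A larger `smooth_sigma` pools information over a wider neighbourhood and gives a smoother, more robust shift estimate, which is the kind of trade-off one would expect to tune against the resolution of the images.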

The first problem we will address is the fitting of existing atomic models into 3D maps from Cryo-EM experiments. A number of tools exist for this problem, including tools in the ChimeraX graphical software and the purpose-built Flex-EM package. Flex-EM works by dividing the atomic model into rigid bodies, while shift field refinement allows smooth deformation of the model, so we expect the methods to be complementary both across different structures and across regions of individual structures. We therefore aim to implement shift field refinement as an extension to Flex-EM and work out how to optimally combine the two approaches by testing across a range of solved structures.
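To make the contrast with rigid-body fitting concrete, the following sketch shows how a smooth shift field defined on a map grid might be applied to atomic coordinates: each atom is moved by the shift interpolated at its own position, so neighbouring atoms move by similar but not identical amounts. This is an assumption-laden illustration rather than the Flex-EM or pysheetbend code. The function names are hypothetical, and coordinates and shifts are both expressed in grid units for simplicity.

```python
import numpy as np


def sample_shift_field(field, grid_coords):
    """Trilinearly interpolate a (nx, ny, nz, 3) shift field at fractional grid coordinates."""
    field = np.asarray(field, dtype=float)
    coords = np.asarray(grid_coords, dtype=float)
    upper = np.array(field.shape[:3]) - 1
    coords = np.clip(coords, 0, upper)  # keep atoms inside the grid
    base = np.minimum(np.floor(coords).astype(int), upper - 1)
    frac = coords - base
    shifts = np.zeros((coords.shape[0], 3))
    # Accumulate the contribution of the eight grid points around each atom.
    for corner in np.ndindex(2, 2, 2):
        weight = np.ones(coords.shape[0])
        for axis, bit in enumerate(corner):
            weight *= frac[:, axis] if bit else 1.0 - frac[:, axis]
        idx = base + np.array(corner)
        shifts += weight[:, None] * field[idx[:, 0], idx[:, 1], idx[:, 2]]
    return shifts


def apply_shift_field(atom_grid_coords, field):
    """Move every atom by the shift interpolated at its position (grid units)."""
    atoms = np.asarray(atom_grid_coords, dtype=float)
    return atoms + sample_shift_field(field, atoms)


# Illustrative field: the x-shift grows linearly along x, a smooth stretch
# that no single rigid-body transformation of the whole model could reproduce.
field = np.zeros((5, 5, 5, 3))
field[..., 0] = 0.1 * np.arange(5)[:, None, None]
atoms = np.array([[2.5, 1.0, 1.0], [0.0, 3.0, 3.0]])
print(apply_shift_field(atoms, field))
```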

The second problem we will address stems from the fact that EM models of a molecule are produced by averaging over tens or hundreds of thousands of particles which have been imaged by the microscope. In practice, large molecules are generally at least somewhat flexible, leading to 'blurring' at the edges of the final 3D image. We will address this problem by using existing tools to determine maps for smaller clusters of particles in which the molecules are in similar conformations, and then using shift-field refinement to account for differences between the clusters before combining them.
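The effect of correcting cluster maps before averaging can be sketched in one dimension. The example below is a toy illustration under stated assumptions, not the project's pipeline: each "cluster map" is a copy of the same peak displaced by a small amount, the displacement is estimated with a linearised local least-squares fit against a reference, and the map is resampled to undo it. The helper name `local_shift`, the peak widths and the shifts are all hypothetical.

```python
import numpy as np


def local_shift(source, target, sigma=8.0, eps=1e-12):
    """First-order estimate of s(x) where target(x) ~ source(x - s(x))."""
    half = int(4 * sigma)
    kernel = np.exp(-0.5 * (np.arange(-half, half + 1) / sigma) ** 2)
    gradient = np.gradient(source)
    numerator = np.convolve((target - source) * gradient, kernel, mode="same")
    denominator = np.convolve(gradient * gradient, kernel, mode="same")
    return -numerator / (denominator + eps)


def make_peak(x, centre, width=5.0):
    """Toy 1D 'map': a single Gaussian feature."""
    return np.exp(-0.5 * ((x - centre) / width) ** 2)


x = np.arange(200, dtype=float)
reference = make_peak(x, 100.0)
# Three cluster maps in slightly different 'conformations'.
clusters = [make_peak(x, 100.0 + s) for s in (-1.0, 0.0, 1.0)]

# Averaging without correction blurs the feature.
naive = np.mean(clusters, axis=0)

# Estimate each cluster's shift relative to the reference and undo it
# by resampling the cluster map at x + s(x) before averaging.
aligned = []
for cluster in clusters:
    shift = local_shift(reference, cluster)
    aligned.append(np.interp(x + shift, x, cluster))
corrected = np.mean(aligned, axis=0)

print(float(naive.max()), float(corrected.max()))
```

In this toy case the corrected average recovers a taller, sharper peak than the naive average. Real maps would need the three-dimensional equivalent and probably several refinement cycles, since the linearised estimate only holds for small shifts.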

Finally, we will perform preliminary investigations of the use of shift-field refinement to model motion of molecules between successive images from the microscope. This step is also important for reducing blurriness in EM maps.

Planned Impact

Cryo-electron microscopy has progressed over the last decade from being a niche technique for the low-resolution imaging of large complexes, to a comparatively routine technique suitable for solving most medium to large structures. The resolution of the best EM reconstructions is now sufficient for de novo model building. The UK now boasts a number of large EM facilities, for example at the MRC, Diamond and Leeds.

While the capital cost of an EM facility is large, the method offers substantial benefits for certain classes of problem. Unlike X-ray crystallography, the sample does not need to be crystallized - a time-consuming and sometimes unsuccessful step. The challenge of crystallisation introduces a risk in the crystallographic pathway which carries its own cost. Consequently, EM will see increasing use in the biotech and pharmaceutical industries. EM methods have the further benefit of imaging molecules in a state which is undistorted by crystal contacts, and thus in some cases more informative for biological problems.

The CCP4 project was established to provide software for X-ray crystallography and has been very successful in serving the biotech and pharmaceutical industries, as evidenced by over a hundred annual software licenses issued to industrial customers, raising typically £1m per annum in income. CCP-EM seeks to fill the same role for EM users. Professor Cowtan's work has contributed significantly to the success of CCP4, with his contributions to density modification, model building, visualisation and supporting infrastructure attracting over 10,000 citations in the peer-reviewed literature, as well as being cited in patents.

The development of improved model fitting and refinement software specialised to the interpretation of EM maps, and its contribution to the CCP-EM software suite for EM structure solution, will make the method more accessible to non-specialist users, reduce the time spent on the task by specialist users and improve the quality of the structures in the EM database. The improvement of 3D EM maps will improve the resolution of EM images, in particular in the peripheral regions which are of most interest for drug binding studies.

The direct benefits to industry are expected, by parallel with X-ray crystallography, to be realised through the development of new drugs and biochemical processes, building on the insights arising from the structures determined by these methods. However, as with X-ray crystallography, we expect that linking individual products to software developments will be difficult due to the closed nature of the sector. The primary indicator of impact will remain the license fees which industrial users are willing to pay for the software.

Finally, the theoretical work which underlies this proposal will improve our understanding of the features of EM electron density reconstructions, and provide a basis for other developers to address the same problems in different ways. By making more tools available in open source form we will reduce the barriers to entry for new developers. We therefore expect that our work will provide a catalyst for an expansion of the development of software for EM structure solution. UK leadership in this area will provide a competitive advantage to our users and partners in UK industry.

Publications

Agirre J (2023) The CCP4 suite: integrative software for macromolecular crystallography. in Acta crystallographica. Section D, Structural biology

Joseph AP (2022) Atomic model validation using the CCP-EM software suite. in Acta crystallographica. Section D, Structural biology

 
Description Objectives 1 and 3:
Implementing automated shift field refinement as part of a larger software package requires code adaptation. This has been done and requires vetting from the owners of the library (TEMPy). A code contribution to the TEMPy library has been committed for calculating density, including B-factors, via GEMMI's density calculator. This is currently in the developer's branch and has not been released. A similar process applies to the shift field algorithm. There are no other known users of the software, since it has not been officially released.
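For readers unfamiliar with what including a B-factor in a density calculation means, the sketch below shows the simplest possible case. It is not GEMMI's or TEMPy's implementation: it assumes a point atom whose density is smeared into an isotropic Gaussian by the B-factor alone, ignoring the atomic form factor, and the function name and parameters are hypothetical. Under that assumption the density at distance r from the atom is (4π/B)^(3/2) exp(-4π²r²/B) per electron, which integrates to one.

```python
import numpy as np


def point_atom_density(grid_points, centre, b_factor, electrons=1.0, occupancy=1.0):
    """Density of an idealised point atom blurred by an isotropic B-factor.

    grid_points: array of shape (..., 3) in angstroms; b_factor in square angstroms.
    Simplifying assumption: the atomic form factor is ignored, so the B-factor
    alone sets the width of the Gaussian.
    """
    offsets = np.asarray(grid_points, dtype=float) - np.asarray(centre, dtype=float)
    r_squared = np.sum(offsets ** 2, axis=-1)
    normalisation = (4.0 * np.pi / b_factor) ** 1.5
    return occupancy * electrons * normalisation * np.exp(
        -4.0 * np.pi ** 2 * r_squared / b_factor
    )


# Illustrative use: a carbon-like atom (6 electrons) sampled on a 0.25 angstrom grid.
spacing = 0.25
axis = np.arange(-4.0, 4.0 + spacing / 2, spacing)
grid = np.stack(np.meshgrid(axis, axis, axis, indexing="ij"), axis=-1)
rho = point_atom_density(grid, centre=[0.0, 0.0, 0.0], b_factor=20.0, electrons=6.0)
print(float(rho.sum() * spacing ** 3))  # integrated density, close to the electron count
```

A larger B-factor spreads the same number of electrons over a wider volume, lowering the peak height, which is why omitting B-factors makes calculated density sharper than observed density for mobile regions.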

Objective 2:
Collaborative work with a core developer from CCP-EM (Agnel) is ongoing with discussions of use cases and improvements to optimally apply both shift field and Flex-EM to any given problem.

Objective 4:
We have taken the approach to use shift field refinement in refining maps against a reference map of the same dataset to obtain a mask for the part with variable conformation.
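One way such a mask could be derived, offered purely as an illustrative assumption since the record does not state the criterion used, is to flag grid points where the magnitude of the refined shift exceeds a threshold. The function name and the threshold value are hypothetical.

```python
import numpy as np


def flexible_region_mask(shift_field, threshold):
    """Boolean mask of grid points whose shift magnitude exceeds the threshold.

    shift_field has shape (nx, ny, nz, 3); the threshold is in the same units
    as the shifts. Thresholding on magnitude is an assumed criterion.
    """
    magnitude = np.linalg.norm(np.asarray(shift_field, dtype=float), axis=-1)
    return magnitude > threshold


# Illustrative field: everything static except one grid point that moves by 5 units.
field = np.zeros((4, 4, 4, 3))
field[1, 1, 1] = [0.0, 3.0, 4.0]
mask = flexible_region_mask(field, threshold=2.0)
print(int(mask.sum()))
```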

Objective 5:
Initial investigations were carried out in collaboration with student Mateusz Olek. However, this has largely been addressed through the "locscale" project, and so is no longer a priority.

Objective 6:
The code from Objective 4 can be down-converted to 2D to address this problem; however, that has not yet been completed.

Objective 7:
Source code for the standalone version of the software is in a GitHub repository.

Objective 8:
The data have been curated for release when we next publish the work.
Exploitation Route The software has been contributed to the CCP-EM project and back to the Flex-EM developers for further development. As part of this we have also provided the Flex-EM team with code for electron density calculation including B-factors, which were a missing feature for them.
Sectors Manufacturing, including Industrial Biotechnology, Pharmaceuticals and Medical Biotechnology

 
Description Automated shift field refinement will be incorporated into the new version 2.0 of the CCP-EM software suite. A standalone version of the automated shift field refinement has been committed to a GitHub repository. (The Zenodo DOI is linked as a software output.) A code contribution to the TEMPy library has been committed for calculating density, including B-factors, via GEMMI's density calculator, and for shift field refinement. This is currently in the developer's branch and has not yet been released.
First Year Of Impact 2022
Sector Manufacturing, including Industrial Biotechnology, Pharmaceuticals and Medical Biotechnology
Impact Types Economic

 
Title scotthoh/pysheetbend: 0.5.4-alpha 
Description Full Changelog: https://github.com/scotthoh/pysheetbend/compare/v0.3.3-alpha...v0.5.4-alpha Working alpha release. pysheetbend refine_model_to_map to perform shift-field refinement for coordinates. Added documentation hosted at https://pysheetbend.readthedocs.io 
Type Of Technology Software 
Year Produced 2023 
Impact The software has been contributed back to the Flex-EM developers, adding a significant new feature in terms of electron density calculation with B-factors, and refinement. 
URL https://zenodo.org/record/7693012
 
Title scotthoh/pysheetbend: Preleases_v0.3.3-alpha 
Description Pre-release for self-testing. Code under development. 
Type Of Technology Software 
Year Produced 2022 
Impact Improvements to the underlying TEMPy code which will also support other developers. 
URL https://zenodo.org/record/6326049
 
Description CCP-EM Icknield Workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited lecture and tutorial on automated de-novo model building into electron microscopy maps.
Year(s) Of Engagement Activity 2022
URL https://instruct-eric.org/events/2022-icknield-workshop-on-model-building-and-refinement/