CCP4 Advanced integrated approaches to macromolecular structure determination

Lead Research Organisation: University of York
Department Name: Chemistry

Abstract

Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.

Planned Impact

The general importance of macromolecular crystallography, and of CCP4 in particular, is described in the Pathways to Impacts section.

CCP4 users in the pharmaceutical and biotechnology sector are most often involved in the study of protein-ligand (most often drug) complexes. The critical computational step in this process is molecular replacement (MR), in which a known atomic model from a similar structure is used to explain the diffraction pattern of the unknown structure. The MR approach is used in more than 70% of structure solutions. However, it is not uncommon for molecular replacement to yield a poor electron density map because of changes in the conformation of the protein. The software developed in this work package aims to significantly reduce the number of cases in which problems occur by increasing the range of convergence of the initial refinement of the MR model, while dramatically increasing the speed of the refinement step so that many more candidate models can be screened.

Cryo-Electron Microscopy (EM) is an increasingly important method for the determination of the structure of pathogens and complexes. The same methods will also be implemented for cryo-EM data, where the resolution tolerance of the methods will facilitate the interpretation of lower resolution reconstructions.

Improvement of the protein model also improves the electron density for the unmodelled ligand or drug, since the electron density features of the known and unknown regions of the structure are related through the diffraction pattern. The speed and radius of convergence of the new method will increase the coverage of automated methods for high throughput screening, which are widely used in the commercial sector. The impact of these developments will be to reduce the number of cases where structure solution fails, to reduce the level of manual intervention required in successful studies, and to increase the accuracy of the resulting structures.

YSBL has played a significant role in the commercial impact of CCP4: two YSBL-originated developments (the REFMAC and COOT software) have been the most-used tools in their field. Several other YSBL developments (DM, MOLREP, BUCCANEER, CCP4I) have citation counts in the hundreds to thousands and are widely used in industry. The YSBL group engages with commercial customers through commercial representation on the CCP4 Executive Committee and Working Groups 1 and 2, through workshops and the CCP4 bulletin board. CCP4 developers including the York group. The working groups provide guidance on which strategic planning is built.

The software produced will be added to the CCP4 suite, and where appropriate to the related CCP-EM software suite for electron microscopy. The CCP4 suite is in use worldwide and is available on Windows, Linux and macOS platforms, providing a direct distribution channel to the overwhelming majority of macromolecular crystallographers. Libraries and methods will be available to other packages as well. CCP4 is updated with major version releases roughly every year, and with automated updates on a roughly monthly basis to enable fast access to new developments. As a result, once the software has been added to the package it will be available within months to both the academic and commercial user community.

Publications

Agirre J (2023) The CCP4 suite: integrative software for macromolecular crystallography. in Acta crystallographica. Section D, Structural biology

Alharbi E (2021) Predicting the performance of automated crystallographic model-building pipelines. in Acta crystallographica. Section D, Structural biology

Alharbi E (2019) Comparison of automated crystallographic model-building pipelines. in Acta crystallographica. Section D, Structural biology

Alharbi E (2020) Pairwise running of automated crystallographic model-building pipelines. in Acta crystallographica. Section D, Structural biology

Alharbi E (2023) Buccaneer model building with neural network fragment selection. in Acta crystallographica. Section D, Structural biology

Bond PS (2020) Predicting protein model correctness in Coot using machine learning. in Acta crystallographica. Section D, Structural biology

Bond PS (2022) ModelCraft: an advanced automated model-building pipeline using Buccaneer. in Acta crystallographica. Section D, Structural biology

Cowtan K (2020) Structural barriers to scientific progress. in Acta crystallographica. Section D, Structural biology

Cowtan K (2020) Shift-field refinement of macromolecular atomic models. in Acta crystallographica. Section D, Structural biology

Krissinel E (2022) CCP4 Cloud for structure determination and project management in macromolecular crystallography. in Acta crystallographica. Section D, Structural biology

 
Description The project consists of four deliverables. Progress against each of these is reported below:

Deliverable 1: To generalise and optimise shift field methods for wider applicability against multiple data sources.
- COMPLETED
- The shift field refinement software has been adapted for electron microscopy use. It has also been extended to allow refinement of maps rather than models. This novel approach allows a range of new applications not supported by existing methods, including refinement of EM maps to phase crystallographic data, and non-rigid refinement of NCS domains.

Deliverable 2: To implement a new resolution-independent model building framework in BUCCANEER and NAUTILUS.
- IN PROGRESS
- Paul Bond has demonstrated a new machine learning approach to feature interpretation in electron density maps for model building using GPUs. We are beginning to test this to evaluate its impact on model building.
- Paul Bond has developed a new machine learning approach to the growing of protein chains, using a neural network to identify likely chain conformations. Results suggest significantly improved performance at low resolutions. The method is currently being retrained against a larger dataset. We are evaluating how to build this work into a future release.

Deliverable 3: To develop a new stereochemical regularisation software library to enable critical speed improvements.
- COMPLETED
- Two new high-performance regularization algorithms have been implemented. The first, described as a pseudo-regularizer, moves overlapping fragments of the input model onto the refined model to restore geometry. This is instantaneous. The second, a general purpose regularizer, is fast but not instantaneous. This will have application in future work to provide web and cloud based software for model building and refinement.
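
The pseudo-regularizer idea described above (moving overlapping fragments of the input model onto the refined model) can be illustrated with a minimal sketch. This is not the project's implementation: the function names `kabsch` and `pseudo_regularize`, the `window` and `step` parameters, the use of NumPy, and the choice to average coordinates where fragments overlap are all assumptions made for illustration. Each fragment of the original (well-formed) coordinates is superposed as a rigid body onto the corresponding refined coordinates, so that local geometry comes from the input model while the overall position follows the refined one.

```python
import numpy as np


def kabsch(mobile, target):
    """Rigid-body fit: return (R, t) so that mobile @ R.T + t best matches target."""
    mobile_centre = mobile.mean(axis=0)
    target_centre = target.mean(axis=0)
    # Covariance between the two centred coordinate sets.
    H = (mobile - mobile_centre).T @ (target - target_centre)
    U, _, Vt = np.linalg.svd(H)
    # Flip the last axis if needed so the result is a proper rotation.
    d = 1.0 if np.linalg.det(Vt.T @ U.T) >= 0 else -1.0
    R = Vt.T @ np.diag([1.0, 1.0, d]) @ U.T
    t = target_centre - R @ mobile_centre
    return R, t


def pseudo_regularize(original, refined, window=5, step=2):
    """Hypothetical sketch: superpose overlapping fragments of `original`
    onto `refined` and average the results where fragments overlap."""
    original = np.asarray(original, dtype=float)
    refined = np.asarray(refined, dtype=float)
    n = len(original)
    starts = list(range(0, max(n - window, 0) + 1, step))
    if starts[-1] + window < n:
        starts.append(n - window)  # make sure the final atoms are covered
    total = np.zeros_like(original)
    counts = np.zeros(n)
    for s in starts:
        fragment = original[s:s + window]
        R, t = kabsch(fragment, refined[s:s + window])
        total[s:s + window] += fragment @ R.T + t
        counts[s:s + window] += 1
    return total / counts[:, None]
```

When the refined model is an exact rigid-body copy of the original, every fragment fits perfectly and the sketch returns the refined coordinates unchanged. For non-rigid shifts, averaging across overlaps is a simplification that only approximately preserves fragment geometry.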

Deliverable 4: To implement contact predictions developed in WP3 within BUCCANEER and NAUTILUS.
- IN PROGRESS
- We are discussing ways forward on this with the Liverpool group, subject to staff availability there.
Exploitation Route We have been exploring a potential collaboration with Maya Topf's Flex-EM group in London to extend their software using the methods developed here, on the basis of the new results described above. That has led to a responsive mode BBSRC grant to explore this application.
Sectors Manufacturing, including Industrial Biotechnology, Pharmaceuticals and Medical Biotechnology

 
Description The software and methods we have developed have been contributed (along with other software) to the CCP4 and CCP-EM software suites, which are licensed to industrial users at over 100 sites worldwide, raising a licence income of over £1.4m/year. We engage directly with industrial users through CCP4 Working Group 1, the CCP4 and CCP-EM annual symposia and workshops, as well as on an ad hoc basis in relation to individual problems. Particular developments arising from this project include the extension of our existing model building methods to larger structures and further automation of model improvement for deposition.
First Year Of Impact 2022
Sector Manufacturing, including Industrial Biotechnology, Pharmaceuticals and Medical Biotechnology
 
Title Sheetbend software for model morphing with non-atomic parameterizations. 
Description Software for optimizing a 3D model of a biological molecule to best explain X-ray or electron microscopy observations. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact This release was an update in response to enquiries from other software developers who wished to experiment with incorporating the software in their own software pipelines. 
URL http://fg.oisin.rc-harwell.ac.uk/projects/clipper-progs/
 
Title paulsbond/modelcraft: v2.4.1 
Description Fixed: catching connection errors when requesting PDB entry contents; ensuring MTZ data items use the same ASU definition.
Type Of Technology Software 
Year Produced 2022 
Impact Software released through CCP4 
URL https://zenodo.org/record/6821716
 
Description CCP-EM Icknield Workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited lecture and tutorial on automated de-novo model building into electron microscopy maps.
Year(s) Of Engagement Activity 2022
URL https://instruct-eric.org/events/2022-icknield-workshop-on-model-building-and-refinement/
 
Description CCP-EM Icknield Workshop on Model Building and Refinement for High Resolution EM Maps 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact 'Icknield Workshop on Model Building and Refinement for High Resolution EM Maps', Apr 2019. This course is aimed at structural biologists with high resolution EM maps ready for, or in the process of, model building and refinement. This three-day course will host some of the leading software developers and provide ample contact time to allow delegates to discuss their data in detail alongside traditional lectures and tutorials. The principal benefit to the participants was an awareness of tools which can perform de-novo model building in high resolution EM maps, removing the model bias associated with fitting pre-determined structures and facilitating the use of EM when no prior structure is available. The principal benefit to us was contact with real EM data and users, giving us a better awareness of the problems to be solved.
Year(s) Of Engagement Activity 2019,2020
URL https://www.ccpem.ac.uk/training/icknield_2019/icknield_2019.php
 
Description From special interests to social anxiety: autism in academia 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Everyone has a unique brain and therefore different skills, abilities, and ways to contribute at work. In science there are many research challenges and ways to approach them, but how inclusive are we to neurodivergent scientists?
Year(s) Of Engagement Activity 2022
URL https://ncas.ac.uk/from-special-interests-to-social-anxiety-autism-in-academia/
 
Description Presentation at CCP-EM Spring Symposium 2019 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited talk on 'Extending model building and refinement tools for Cryo-EM applications' at the CCP-EM Spring Symposium, Nottingham, Apr 2019.
Year(s) Of Engagement Activity 2019
URL https://www.youtube.com/watch?v=evbJV6431EA