CCP4 Advanced integrated approaches to macromolecular structure determination

Lead Research Organisation: University of York
Department Name: Chemistry

Abstract

Proteins, DNA and RNA are the active machines of the cells which make up living organisms, and are collectively known as macromolecules. They carry out all of the functions that sustain life, from metabolism through replication to the exchange of information between a cell and its environment. They are coded for by a 'blueprint' in the form of the DNA sequence in the genome, which describes how to make them as linear strings of building blocks. In order to function, however, most macromolecules fold into a precise 3D structure, which in turn depends primarily on the sequence of building blocks from which they are made. Knowledge of the molecule's 3D structure allows us both to understand its function, and to design chemicals to interfere with it.
Due to advances in molecular biology, a number of projects, including the Human Genome Project, have led to the determination of the complete DNA sequences of many organisms, from which we can now read the linear blueprints for many macromolecules. As yet, however, the 3D structure cannot be predicted from knowledge of the sequence alone. One way to "see" macromolecules, and so to determine their 3D structure, involves initially crystallising the molecule under investigation, and subsequently imaging it with suitable radiation.
Macromolecules are too small to see with normal light, and so a different approach is required. With an optical microscope we cannot see objects which are smaller than the wavelength of light, roughly 1 millionth of a metre; atoms are about 1000 times smaller than this. However, X-rays have a wavelength about the same as the size of atoms. For this reason, in order to resolve the atomic detail of macromolecular structure, we image them with X-rays rather than with visible light.
The process of imaging the structures of macromolecules that have been crystallised is known as X-ray crystallography. X-ray crystallography is like using a microscope to magnify objects that are too small to be seen with visible light. Unfortunately, X-ray crystallography is complicated because, unlike a microscope, there is no lens system for X-rays, and so additional information and complex computation are required to reconstruct the final image. This information may come from known protein structures using the Molecular Replacement (MR) method, or from other sources including Electron Microscopy (EM).
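The need for additional information can be illustrated with a toy one-dimensional calculation. The sketch below is not CCP4 code; the function names and the eight-point "crystal" are hypothetical, chosen only for illustration. It assumes that a diffraction experiment records the amplitudes of the Fourier transform of the density but not the phases, and shows that the image is recovered only when phases are supplied from elsewhere (here, idealised as coming from a perfectly matching known model, as in MR).

```python
import cmath
import math

def structure_factors(density):
    """Discrete Fourier transform of a 1D toy 'electron density'."""
    n = len(density)
    return [sum(density[x] * cmath.exp(-2j * math.pi * h * x / n) for x in range(n))
            for h in range(n)]

def synthesis(factors):
    """Inverse transform: rebuild a density map from complex structure factors."""
    n = len(factors)
    return [sum(factors[h] * cmath.exp(2j * math.pi * h * x / n) for h in range(n)).real / n
            for x in range(n)]

# Hypothetical unit cell of 8 grid points holding a heavy atom and a light atom.
true_density = [0, 0, 3, 0, 0, 1, 0, 0]
F = structure_factors(true_density)

# Assumption: the experiment measures only the amplitudes |F|; phases are lost.
amplitudes = [abs(f) for f in F]

# Setting every phase to zero does not reproduce the true density.
no_phase_map = synthesis([complex(a, 0.0) for a in amplitudes])

# Phases taken from a model (idealised here as the exact answer) combined
# with the measured amplitudes do reproduce it.
model_phases = [cmath.phase(f) for f in F]
phased_map = synthesis([cmath.rect(a, p) for a, p in zip(amplitudes, model_phases)])

print("zero phases :", [round(v, 2) for v in no_phase_map])
print("model phases:", [round(v, 2) for v in phased_map])
```

In practice the known structure only approximates the unknown one, so the borrowed phases are imperfect and the resulting map has to be improved iteratively.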
Once the structure is known, it is easier to pinpoint how macromolecules contribute to the living cellular machinery. Pharmaceutical research uses this as the basis for designing drugs to turn the molecules on or off when required. Drugs are designed to interact with the target molecule to either block or promote the chemical processes which they perform within the body. Other applications include protein engineering and carbohydrate engineering.
The aim of this project is to improve the key computational tools needed to extract a 3D structure from X-ray and electron diffraction experiments. It will provide continuing support to a Collaborative Computational Project (CCP4, first established in 1979), which has become one of the leading sources of software for this task. The project will support efficient and effective use of the synchrotrons that produce the X-rays used in most crystallographic experiments, and will also extend to the use of electron microscopes, which have gained much recent publicity with the Nobel Prize being awarded to researchers in this field. It will provide more powerful tools to allow users to exploit information from known protein structures when the match to the unknown structure is very poor. Finally, it will allow structures to be solved even when only poor-quality and very small crystals are obtained.

Technical Summary

This proposal incorporates four related work packages.
In WP1 we will expand on our work using established and novel metrics of data quality and consistency to quantify the relationship between diffraction and map quality. The tools will be used to optimise approaches to structure determination from multiple or serial crystallography data to enable optimal selection of collected data and fully utilise all the information in structural refinement. WP1 will also develop and implement methods for electron diffraction data collection, integration and refinement.
WP2 will generalise the use of shift field refinement, extend its usage to hybrid refinement approaches, and develop new software libraries to enhance and speed up protein structure model building and refinement across a wide resolution range.
In WP3 we will develop and implement contact prediction methods for use in crystallography. These will help identify protein domain boundaries and define new search model approaches. The contact prediction approach will also be used to validate Molecular Replacement solutions and assist in the interpretation of crystallographically derived protein:protein contacts.
In WP4 we will develop a model for electron scatter from macromolecular samples to enable software development and experimental design. These models will be used to develop and implement new scaling algorithms for electron diffraction data within DIALS.

Planned Impact

Brown (University of Kent) is the chairman of CCP4, and has overall responsibility to the grant-funding agencies (currently BBSRC and MRC) and commercial license holders for delivery of CCP4's software development, maintenance, distribution, and outreach programs. The impact of their respective contributions, therefore, relates to the specific proposed program of work from the co-applicant centres (summarised in each of the relevant grant applications), to the output of CCP4 as a whole, and to the array of basic and applied macromolecular crystallography (MX) that depends upon CCP4. In this application we have specifically requested support for the WP2 lead Tews (Southampton) to add directly allocated additional management resource to impact and outreach.
MX is an essential enabling technology for the cellular and molecular biosciences, and consequently for UK pharmaceutical and biotechnological industries. UK research councils and research charities have recognised the need for infrastructural support of the discipline, most recently by assigning the last available Phase III slot at the Diamond Light Source (DLS) to a state-of-the-art facility for micro-focus and in situ crystallography for the academic and commercial MX community. In turn, the biotechnological and pharmaceutical science base that is fostered by such investments contributes hugely to the UK economy: in 2010 the pharmaceutical sector provided 67,000 jobs, each contributing £195,000 of GVA, with 25,000 of these positions being in high skill R&D activities (source: http://www.abpi.org.uk). In particular the majority of industrial access to DLS is for MX - amounting to almost 20% of the total user activity in MX.
Collaborative Computational Project 4 (CCP4) was established by the Research Councils in 1979 to promote the development and dissemination of software and best-practice in MX. To achieve this end, it uses Research Council funding to leverage a circa 3x larger commercial income, which it invests in MX training and in software development, maintenance, and distribution. As such, grant funding of CCP4 has the additional impact of strengthening an important interface between UK academic and commercial scientific endeavours.
CCP4's dissemination activities include the hosting of an annual methods-development meeting, attended typically by 400-500 graduate students, young researchers and PIs. It also co-sponsors with the British Crystallographic Association annual week-long graduate summer schools, held alternately in Scotland and England, at which a cohort of 40+ students are intensively trained in current methods in protein crystallography. These two elements, supported primarily by CCP4's commercial license income, serve to keep the UK at the forefront of methods development in this central technique, and ensure that a pool of well-trained, increasingly interdisciplinary scientists is available to apply the technique in academic and/or commercial settings.
The program of work described here, for which Brown will have ultimate oversight, looks forward to the next stage of the development of MX and the emerging field of electron diffraction, to address some of the outstanding obstacles that limit the success and/or efficient application of the technique. By allowing structures to be determined from ever more challenging targets, including membrane proteins and proteins for which sample preparation is inherently difficult, this work will impact directly upon areas of biomedical and otherwise commercial interest. For example, WP1 and WP4 address the emerging area of electron diffraction and cutting-edge approaches to improve efficiency in data collection and data usage. WP2 focuses on the new area of hybrid modelling to build proteins against crystallographic data, and WP3 uses state-of-the-art contact prediction methods and applies them to crystallographic structure solution and interpretation.

Publications

 
Description Progress is itemized by workpackage below:

WP1. Quantifying relationships between diffraction and map quality.

6 papers and 2 software products have been produced. Progress by deliverable is as follows:
- To expand a METRIX database and develop machine learning to support interactive decision making through rapid density map quality assessment. DONE
- To develop Fourier optics based relationships between diffraction, phase and map quality applicable to scattering experiments including X-ray diffraction, electron diffraction and cryoEM. IN PROGRESS
- To develop new tools for refinement against electron diffraction data. IN PROGRESS, SOME CODE RELEASED

WP2. Develop shift field refinement for Hybrid Structural modelling

7 papers and 1 software product have been produced. Progress by deliverable is as follows:
- To generalise and optimise shift field methods for wider applicability against multiple data sources. DONE
- To implement new resolution independent model building framework in BUCCANEER and NAUTILUS. IN PROGRESS WITH PRELIMINARY RESULTS
- To develop new stereochemical regularisation software library to enable critical speed improvements. DONE
- To implement contact predictions developed in WP3 within BUCCANEER and NAUTILUS. TO DO

WP3. Implementation and exploitation of contact prediction approaches in crystallographic structure solution and validation.

NB. The deliverables for this workpackage have been changed significantly, in consultation with BBSRC, to reflect the impact of AlphaFold 2.
15 papers, 4 software products and 2 software updates have been produced. Progress by deliverable is as follows:
- To develop tools that utilise contact and distance prediction to assist in validation of protein structures. DONE
- To implement contact prediction methods in ConKit to accurately define domain boundaries. DONE
- To use contact predictions to differentiate between biologically-relevant interfaces and crystal lattice contacts in crystal structures. IN PROGRESS
- To use contact predictions to help model building and completion. IN PROGRESS

WP4. Physical Model for Electron Diffraction

1 paper has been published, 1 online service has been set up for ED simulations, and ED simulations have been conducted for 3 biomolecules: Crambin, Biotin and Ireloh. 1 presentation, 2 software products and 1 dataset have been produced. Progress by deliverable is as follows:
- To develop a model of electron scattering for computer simulations to assist in software and experimental design. DONE
- To develop and implement scaling algorithms for electron diffraction. TO DO
- To link to WP1 to develop new tools for refinement for electron diffraction data. IN PROGRESS
Exploitation Route All of the software products are open source for future development by other authors, and are also made available for incorporation in external software pipelines, including in the pharmaceutical industry.
Sectors Healthcare, Pharmaceuticals and Medical Biotechnology

URL https://www.ccp4.ac.uk/
 
Description Impact
- Driving the ED&I agenda and appointing an ED&I team.
- Organising online events: Study Weekend 2022 online was a very successful meeting with 1,088 participants.
- Most successful year for working group 2 meetings, with higher attendance numbers (generally over 45).
- Initial planning for a new working group 2 meeting schedule with more, more inclusive, and shorter meetings.
- Producing new public-facing web pages for the project with modern presentation and improved accessibility.
- Starting the CCP4 Documentation project to make outputs more accessible.
2 publications have also been associated with the impact part of the project under Ivo Tews.
First Year Of Impact 2021
Sector Digital/Communication/Information Technologies (including Software), Education, Pharmaceuticals and Medical Biotechnology
Impact Types Societal

 
Title paulsbond/modelcraft: v2.4.1 
Description Fixed: catching connection errors when requesting PDB entry contents; ensuring MTZ data items use the same ASU definition.
Type Of Technology Software 
Year Produced 2022 
Impact Software released through CCP4 
URL https://zenodo.org/record/6821716
 
Description CCP-EM Icknield Workshop 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Invited lecture and tutorial on automated de-novo model building into electron microscopy maps.
Year(s) Of Engagement Activity 2022
URL https://instruct-eric.org/events/2022-icknield-workshop-on-model-building-and-refinement/
 
Description From special interests to social anxiety: autism in academia 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Everyone has a unique brain and therefore different skills, abilities, and ways to contribute at work. In science there are many research challenges and ways to approach them, but how inclusive are we to neurodivergent scientists?
Year(s) Of Engagement Activity 2022
URL https://ncas.ac.uk/from-special-interests-to-social-anxiety-autism-in-academia/