CCP4 Grant Renewal 2014-2019: Question-driven crystallographic data collection and advanced structure solution

Lead Research Organisation: University of Cambridge
Department Name: Cambridge Institute for Medical Research

Technical Summary

This proposal incorporates five related work packages.

In WP1 we will track synchrotron-collected data through computational structure determination, to find whether the most useful data can be recognised a priori using established or novel metrics of data quality and consistency. We will then enable data collection software to communicate with pipelines and graphics programs to assess when sufficient data have been collected for a given scientific question, and so to prioritise further beamtime usage. We will also communicate extra information about diffraction data to structure determination programs, and so support the statistical models and algorithms being developed in WP4.

WP2 will improve the key MR step of model preparation, especially from diverged, NMR, or ab initio models. One development will be to extend the size limit of ab initio search model generation by exploiting sequence covariance algorithms.

In WP3 we will use our description of electron density maps as a field of control points to make better use of electron density or atomic models positioned by MR. Restrained manipulation of these points provides a low-order parameterisation of refinement decoupled from atomic models, and therefore suitable for highly diverged atomic models or EM-derived maps. We will extend this approach to characterise local protein mobility without the requirement, inherent in TLS, to predefine rigid groups.

In WP4 we will statistically model non-idealities in experimental data, including non-isomorphism, spot overlap, and radiation damage. The resulting models, implemented in REFMAC, will be applied to refinement using data that are annotated by WP1 tools and tracked by WP0.

WP0 will provide the tools to integrate the other WPs. For this, it will create a cloud environment where storage- and compute-resources can be utilised optimally, and where rich information can be passed among beamlines, pipelines, and graphics programs.

Planned Impact

The importance of macromolecular crystallography in general, and of CCP4 in particular, is described in the Pathways to Impacts section.

Molecular Replacement (MR) is an increasingly common route to solving the phase problem for protein crystal structures, its popularity arising from being fast and cheap. In 2012, 77% of protein structures submitted to the PDB were tagged as solved using MR. MR ranges in difficulty from positioning models that accurately model the scattering from the entire asymmetric unit to positioning very small components of the total scattering with high error. Essential to the success of MR is the availability of a search model that represents a portion of the unknown structure accurately enough that, once placed, it provides approximate phasing information allowing for further interpretation of the resulting electron density maps. WP2 aims to improve the efficiency and applicability of MR, enhancements that will reduce the time spent by the crystallographer on structure solution and extend the proportion of targets soluble by the technique. To do this WP2 will improve methods to assemble search models from conventional sources such as homology models and NMR structures. It will further build on recent innovations exploiting a novel source of structural information - low computational-cost, fragment-assembly derived ab initio models, as implemented in the CCP4 program AMPLE, extending the method to membrane proteins. The nascent technique of predicted contact-based ab initio modelling will further be explored, potentially allowing some large novel protein folds to be solved by MR for the first time. The predominance of MR as a structure solution method ensures that the entire crystallographic community will benefit from these improvements and so, in turn, will researchers in the many biological communities for whom protein structure information is valuable.

The software developed in WP2 will be added to the CCP4 suite. The CCP4 suite is used world-wide and is available on Windows, Linux and Mac OS platforms, providing a direct distribution channel to macromolecular crystallographers. CCP4 has recently introduced an automated update mechanism to enable faster access to new developments. As a result, developments in WP2 will be available immediately to the user community.

Although the focus in WP2 is on structural bioinformatics for crystallographic ends, we envisage that some of the methods we will develop to process and refine ab initio models will prove valuable to a broader bioinformatics community. For example, refinement of predicted contact-based models with Rosetta or other fragment-based protocols has not yet been done. Our benchmarking will indicate whether it provides a general method to improve local or global quality of the models: such a protocol would obviously be valuable to a broad modelling community. Similarly, incorporation of predicted contacts from the latest generation of covariance software into fragment assembly ab initio modelling is novel: the benefits of using larger or smaller numbers of predictions will become apparent through our benchmarking and will again be of broad benefit to protein modellers.

Publications

Hatti KS (2020) Factors influencing estimates of coordinate error for molecular replacement. in Acta crystallographica. Section D, Structural biology

McCoy A (2018) Gyre and gimble: a maximum-likelihood replacement for Patterson correlation refinement. in Acta crystallographica. Section D, Structural biology

McCoy AJ (2017) Acknowledging Errors: Advanced Molecular Replacement with Phaser. in Methods in molecular biology (Clifton, N.J.)

Oeffner RD (2018) On the application of the expected log-likelihood gain to decision making in molecular replacement. in Acta crystallographica. Section D, Structural biology

Read RJ (2018) Maximum-likelihood determination of anomalous substructures. in Acta crystallographica. Section D, Structural biology

 
Description We have implemented new features in our software for crystallography, which will increase the success rate in finding new structures of protein molecules.
Exploitation Route Phaser, both on its own and as used in the Arcimboldo pipeline, now has a higher success rate in solving difficult structures.
Sectors Pharmaceuticals and Medical Biotechnology

URL http://www.phaser.cimr.cam.ac.uk/index.php/Phaser_Crystallographic_Software
 
Description Our software is heavily used by scientists in the pharmaceutical industry for drug design and development. Our work is helping to improve their productivity.
First Year Of Impact 2016
Sector Pharmaceuticals and Medical Biotechnology
Impact Types Economic

 
Title Gyre and Gimble algorithms for molecular replacement technique 
Description The new algorithms allow the use of likelihood-based optimisation to improve the orientations and relative positions of rigid molecular fragments in a molecular replacement structure solution process. 
Type Of Material Improvements to research infrastructure 
Year Produced 2018 
Provided To Others? Yes  
Impact These algorithms are useful for general users of our software Phaser. They are particularly useful for our collaborators in Isabel Uson's group, who apply these algorithms when searching for fragments based on structural databases. 
 
Title expected log-likelihood-gain calculation (eLLG) 
Description The eLLG calculation enables researchers to determine whether a molecular replacement calculation is likely to succeed or fail, thus focusing their efforts where they are most needed. It is also used increasingly within our Phaser program and structure determination pipelines that use Phaser, in order to devise optimal structure solution strategies. 
Type Of Material Improvements to research infrastructure 
Year Produced 2017 
Provided To Others? Yes  
Impact Our software Phaser has become more effective as a result of the strategy optimisations that this enables. People who use Phaser for fragment-based molecular replacement calculations (Arcimboldo: Uson group, AMPLE: Rigden group, Fragon: Jenkins) can now optimise the size of fragments that they search for.
 
Title Phaser software 
Description Phaser is a computer program that enables macromolecular crystallographers to solve structures using the methods of molecular replacement, single-wavelength anomalous diffraction, or a combination of the two. 
Type Of Technology Software 
Year Produced 2015 
Open Source License? Yes  
Impact Over 40% of recent entries in the worldwide Protein Data Bank acknowledge the use of this software in the solution of the structures they describe. 
URL http://www.phaser.cimr.cam.ac.uk/index.php/Phaser_Crystallographic_Software
 
Description CCP4 Northern Protein Structure Workshop, Carlisle 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact Taught PhD students and post-doctoral fellows how to use our molecular replacement software, as part of a regional workshop.
Year(s) Of Engagement Activity 2016
URL http://www2.le.ac.uk/departments/molcellbiol/staff/khushwant-sidhu/ccp4
 
Description CCP4/BCA summer school 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Airlie McCoy provided advanced crystallography training in a one-week residential workshop.
Year(s) Of Engagement Activity 2016,2017,2018
URL http://www.diamond.ac.uk/Home/Events/2016/BCA-Summer-School.html
 
Description Crystallographic computing training workshop (APS) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Trained students to use advanced features of crystallographic software.
Year(s) Of Engagement Activity 2015
URL http://www.ccp4.ac.uk/schools/APS-2015/
 
Description Crystallographic computing training workshop (Diamond) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Crystallographic computing training workshop
Year(s) Of Engagement Activity 2015
URL http://www.ccp4.ac.uk/schools/DLS-2015/
 
Description Diamond-CCP4 Data Collection and Structure Solution Workshop 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Airlie McCoy taught at a workshop on best practice in data collection and analysis for protein crystallography.
Year(s) Of Engagement Activity 2016,2017
URL http://www.diamond.ac.uk/Home/Events/2017/Diamond-CCP4-Data-Collection-and-Analysis-workshop.html
 
Description SEACOAST: South-East Asian Crystallographic Overview And Systematic Training, Bangkok, Thailand 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Trained PhD students and post-doctoral fellows in the use of crystallographic software included in both the CCP4 and Phenix packages. The program included lectures, hands-on tutorials and one-on-one help for students dealing with problems in their own data.
Year(s) Of Engagement Activity 2020
URL https://seacoast.kmutt.ac.th
 
Description South American crystallography workshop, Brazil 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Taught aspects of protein crystallography to South American PhD students and post-doctoral fellows.
Year(s) Of Engagement Activity 2016,2018
URL http://www.ifsc.usp.br/mx2016/
 
Description South American crystallography workshop, Uruguay 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Taught aspects of protein crystallography in a workshop for South American PhD students and post-doctoral fellows.
Year(s) Of Engagement Activity 2015,2017
URL http://pasteur.uy/en/last-news/mx2017