CCP4 Grant Renewal 2014-2019: Question-driven crystallographic data collection and advanced structure solution
Lead Research Organisation:
University of Cambridge
Department Name: Cambridge Institute for Medical Research
Technical Summary
This proposal incorporates five related work packages.
In WP1 we will track synchrotron-collected data through computational structure determination, to find whether the most useful data can be recognised a priori using established or novel metrics of data quality and consistency. We will then enable data collection software to communicate with pipelines and graphics programs to assess when sufficient data have been collected for a given scientific question, and so to prioritise further beamtime usage. We will also communicate extra information about diffraction data to structure determination programs, and so support the statistical models and algorithms being developed in WP4.
WP2 will improve the key MR step of model preparation, especially from diverged, NMR, or ab initio models. One development will be to extend the size limit of ab initio search model generation by exploiting sequence covariance algorithms.
In WP3 we will use our description of electron density maps as a field of control points to better use electron density or atomic models positioned by MR. Restrained manipulation of these points provides a low-order parameterisation of refinement decoupled from atomic models, and therefore suitable for highly diverged atomic models or EM-derived maps. We will extend this approach to characterise local protein mobility without the requirement of TLS for predefinition of rigid groups.
In WP4 we will statistically model non-idealities in experimental data, including non-isomorphism, spot overlap, and radiation damage. The resulting models, implemented in REFMAC, will be applied to refinement using data that are annotated by WP1 tools and tracked by WP0.
WP0 will provide the tools to integrate the other WPs. To do this, it will create a cloud environment in which storage and compute resources can be used optimally, and in which rich information can be passed among beamlines, pipelines, and graphics programs.
Planned Impact
The importance of macromolecular crystallography in general, and of CCP4 in particular, is described in the Pathways to Impact section.
Molecular Replacement (MR) is an increasingly common route to solving the phase problem for protein crystal structures, its popularity arising from being fast and cheap. In 2012, 77% of protein structures submitted to the PDB were tagged as solved using MR. MR ranges in difficulty from positioning models that accurately model the scattering from the entire asymmetric unit to positioning very small components of the total scattering with high error. Essential to the success of MR is the availability of a search model that represents a portion of the unknown structure accurately enough that, once placed, it provides approximate phasing information allowing for further interpretation of the resulting electron density maps. WP2 aims to improve the efficiency and applicability of MR, enhancements that will reduce the time spent by the crystallographer on structure solution and extend the proportion of targets soluble by the technique. To do this WP2 will improve methods to assemble search models from conventional sources such as homology models and NMR structures. It will further build on recent innovations exploiting a novel source of structural information - low computational-cost, fragment-assembly derived ab initio models, as implemented in the CCP4 program AMPLE, extending the method to membrane proteins. The nascent technique of predicted contact-based ab initio modelling will further be explored, potentially allowing some large novel protein folds to be solved by MR for the first time. The predominance of MR as a structure solution method ensures that the entire crystallographic community will benefit from these improvements and so, in turn, will researchers in the many biological communities for whom protein structure information is valuable.
The software developed in WP2 will be added to the CCP4 suite. The CCP4 suite is used worldwide and is available on Windows, Linux and Mac OS platforms, providing a direct distribution channel to macromolecular crystallographers. CCP4 has recently introduced an automated update mechanism to enable faster access to new developments. As a result, developments in WP2 will be available to the user community as soon as they are released.
Although the focus in WP2 is on structural bioinformatics for crystallographic ends, we envisage that some of the methods we will develop to process and refine ab initio models will prove valuable to a broader bioinformatics community. For example, refinement of predicted contact-based models with Rosetta or other fragment-based protocols has not yet been done. Our benchmarking will indicate whether it provides a general method to improve local or global quality of the models: such a protocol would obviously be valuable to a broad modelling community. Similarly, incorporation of predicted contacts from the latest generation of covariance software into fragment assembly ab initio modelling is novel: the benefits of using larger or smaller numbers of predictions will become apparent through our benchmarking and will again be of broad benefit to protein modellers.
Publications
Hatti KS (2021) Likelihood-based estimation of substructure content from single-wavelength anomalous diffraction (SAD) intensity data. Acta Crystallographica Section D, Structural Biology
Hatti KS (2020) Factors influencing estimates of coordinate error for molecular replacement. Acta Crystallographica Section D, Structural Biology
McCoy A (2018) Gyre and gimble: a maximum-likelihood replacement for Patterson correlation refinement. Acta Crystallographica Section D, Structural Biology
McCoy AJ (2017) Acknowledging Errors: Advanced Molecular Replacement with Phaser. Methods in Molecular Biology (Clifton, N.J.)
Millán C (2018) Exploiting distant homologues for phasing through the generation of compact fragments, local fold refinement and partial solution combination. Acta Crystallographica Section D, Structural Biology
Oeffner RD (2018) On the application of the expected log-likelihood gain to decision making in molecular replacement. Acta Crystallographica Section D, Structural Biology
Read RJ (2018) Maximum-likelihood determination of anomalous substructures. Acta Crystallographica Section D, Structural Biology
Description | We have implemented new features in our software for crystallography, which will increase the success rate in finding new structures of protein molecules. |
Exploitation Route | Phaser, both on its own and as used in the Arcimboldo pipeline, now has a higher success rate in solving difficult structures. |
Sectors | Pharmaceuticals and Medical Biotechnology |
URL | http://www.phaser.cimr.cam.ac.uk/index.php/Phaser_Crystallographic_Software |
Description | Our software is heavily used by scientists in the pharmaceutical industry for drug design and development. Our work is helping to improve their productivity. |
First Year Of Impact | 2016 |
Sector | Pharmaceuticals and Medical Biotechnology |
Impact Types | Economic |
Title | Gyre and Gimble algorithms for molecular replacement technique |
Description | The new algorithms allow the use of likelihood-based optimisation to improve the orientations and relative positions of rigid molecular fragments in a molecular replacement structure solution process. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | These algorithms are useful for general users of our software Phaser. They are particularly useful for our collaborators in Isabel Uson's group, who apply these algorithms when searching for fragments based on structural databases. |
Title | expected log-likelihood-gain calculation (eLLG) |
Description | The eLLG calculation enables researchers to determine whether a molecular replacement calculation is likely to succeed or fail, thus focusing their efforts where they are most needed. It is also used increasingly within our Phaser program and structure determination pipelines that use Phaser, in order to devise optimal structure solution strategies. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2017 |
Provided To Others? | Yes |
Impact | Our software Phaser has become more effective as a result of the strategy optimisations that this enables. People who use Phaser for fragment-based molecular replacement calculations (Arcimboldo: Uson group, Ample: Rigden group, Fragon: Jenkins) can now optimise the size of fragments that they search for. |
Title | Phaser software |
Description | Phaser is a computer program that enables macromolecular crystallographers to solve structures using the methods of molecular replacement, single-wavelength anomalous diffraction, or a combination of the two. |
Type Of Technology | Software |
Year Produced | 2015 |
Open Source License? | Yes |
Impact | Over 40% of recent entries in the worldwide Protein Data Bank acknowledge the use of this software in the solution of the structures they describe. |
URL | http://www.phaser.cimr.cam.ac.uk/index.php/Phaser_Crystallographic_Software |
Description | CCP4 Northern Protein Structure Workshop, Carlisle |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Postgraduate students |
Results and Impact | Taught PhD students and post-doctoral fellows how to use our molecular replacement software, as part of a regional workshop. |
Year(s) Of Engagement Activity | 2016 |
URL | http://www2.le.ac.uk/departments/molcellbiol/staff/khushwant-sidhu/ccp4 |
Description | CCP4/BCA summer school |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Airlie McCoy provided advanced crystallography training in a one-week residential workshop. |
Year(s) Of Engagement Activity | 2016,2017,2018 |
URL | http://www.diamond.ac.uk/Home/Events/2016/BCA-Summer-School.html |
Description | Crystallographic computing training workshop (APS) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Trained students to use advanced features of crystallographic software. |
Year(s) Of Engagement Activity | 2015 |
URL | http://www.ccp4.ac.uk/schools/APS-2015/ |
Description | Crystallographic computing training workshop (Diamond) |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Crystallographic computing training workshop |
Year(s) Of Engagement Activity | 2015 |
URL | http://www.ccp4.ac.uk/schools/DLS-2015/ |
Description | Diamond-CCP4 Data Collection and Structure Solution Workshop |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Airlie McCoy taught in a workshop on best practice in data collection and analysis for protein crystallography. |
Year(s) Of Engagement Activity | 2016,2017 |
URL | http://www.diamond.ac.uk/Home/Events/2017/Diamond-CCP4-Data-Collection-and-Analysis-workshop.html |
Description | SEACOAST: South-East Asian Crystallographic Overview And Systematic Training, Bangkok, Thailand |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Training of PhD students and post-doctoral fellows in the use of crystallographic software included in both the CCP4 and Phenix packages. The program included lectures, hands-on tutorials and one-on-one help with students in dealing with problems with their own data. |
Year(s) Of Engagement Activity | 2020 |
URL | https://seacoast.kmutt.ac.th |
Description | South American crystallography workshop, Brazil |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Taught aspects of protein crystallography to South American PhD students and post-doctoral fellows. |
Year(s) Of Engagement Activity | 2016,2018 |
URL | http://www.ifsc.usp.br/mx2016/ |
Description | South American crystallography workshop, Uruguay |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Taught aspects of protein crystallography in a workshop for South American PhD students and post-doctoral fellows. |
Year(s) Of Engagement Activity | 2015,2017 |
URL | http://pasteur.uy/en/last-news/mx2017 |