CCPN - A Collaborative computational project for macromolecular NMR spectroscopy

Lead Research Organisation: University of Cambridge
Department Name: Biochemistry

Abstract

The CCPN is a Collaborative Computing Project for the study of biological macromolecules and the determination of their structures by NMR spectroscopy. Our aim is to promote collaboration between NMR software developers and to make better software available to the NMR user community. Our first goal is to ensure that the different computer programs used by NMR spectroscopists work together and use each other's results in a seamless fashion. Secondly, we want to make sure that all the experimental results can easily be collated and made available to others. We also arrange meetings and workshops to define, and spread knowledge about, the best ways of carrying out NMR studies.
Since the CCPN started in 2000, we have defined a standard way of describing scientific data in our particular area. We have developed large program libraries to make it easier to write programs that read and write data in this standard form. We have also written a new program for the analysis of NMR spectra, and we have worked with other groups to adapt their programs to use data defined in this standard way. It is now possible to determine the structure of a protein by NMR spectroscopy, using only programs that use data in this standard form. We have additionally written programs to make it easier to write and maintain both the data standard and its program libraries, and we have arranged workshops and annual conferences.
The new application has five main aims:
1. We want to provide a program to process raw NMR data that uses the data standard, organised so that it can be used to process NMR data in different ways and so that it is easy to add new methods.
2. We want to collaborate with other groups to expand the support for different types of NMR analysis. Among other things we shall be adding support for solid state NMR studies, and we will collaborate with two pharmaceutical companies to write programs that can be used in small molecule screening and optimisation by NMR spectroscopy.
3. In collaboration with others we want to combine the programs that interact with the data standard into a tightly integrated software pipeline where the programs speak directly to each other, and where you can compare the use of different programs for the same task. The entire pipeline should be easy to install, and parts of it will be made available and run as web services. At the end of the grant we hope to be able to run the pipeline in a fully automatic mode, so that one can generate a structure automatically, directly from the raw NMR data.
4. We shall continue to support our existing users, improve our documentation and maintain the large amounts of code we have already written.
5. We shall continue to arrange courses, workshops and annual conferences.
Our work helps those who use NMR spectroscopy by providing programs that are more powerful and easier to use and install. It also makes it easier to write those programs, and saves effort by letting new programs make better use of existing code. Ultimately we shall increase both the amount and quality of experimental NMR data that is deposited in public databanks and made available to others. More specifically we hope to make our programs more widely used in industry by adding support for industry-relevant tasks, and to make macromolecular NMR data sufficiently simple to analyse that it can be done by non-specialists without excessive training.

Technical Summary

The CCPN is a Collaborative Computing Project for the macromolecular NMR community. Its purpose is to create data standards, promote software integration, produce and disseminate software, and organise conferences and workshops. Over the next three years we intend to: 1) Provide a program to process raw NMR data that is integrated with the CCPN data standard, and which can be used to process NMR data in different ways and be easily modified in the future. 2) Collaborate with other groups to expand support for the analysis of further types of NMR data. 3) Combine computer programs that support the CCPN data standard into a closely integrated pipeline, and work towards automated structure determination starting from raw NMR data. 4) Continue to support our existing users, improve our documentation and maintain the large body of code we have already written. 5) Continue to organise workshops and study weekends for the NMR community. We shall add support for relaxation analysis, reduced dipolar couplings, and solid state NMR. By adding support for small molecule screening and fast structure determination of protein-ligand complexes we hope to make our software more attractive to the pharmaceutical industry. The software will be easy to install, and designed to work either on a user's own computer or through remote web services. We aim to make NMR more accessible to non-specialists, e.g. by developing a pilot software pipeline that can be used to determine structures directly from raw NMR data. We shall continue to support our users, fix problems, improve documentation, and disseminate our programs through courses and workshops. We plan to make a special effort to improve tools and documentation for programmers, and to spread expertise in using our code to other groups. We will continue to organise meetings, notably our popular annual meetings at which the NMR community discusses and disseminates best practice for studies of macromolecules using NMR.

Publications


Doreleijers JF (2012) CING: an integrated residue-based structure validation program suite. in Journal of biomolecular NMR

Fogh RH (2010) MEMOPS: data modelling and automatic code generation. in Journal of integrative bioinformatics

Gutmanas A (2015) NMR Exchange Format: a unified and open standard for representation of NMR restraint data. in Nature structural & molecular biology

Penkett CJ (2010) Straightforward and complete deposition of NMR data to the PDBe. in Journal of biomolecular NMR

Ragan TJ (2015) Analysis of the structural quality of the CASD-NMR 2013 entries. in Journal of biomolecular NMR

Skinner SP (2015) Structure calculation, refinement and validation using CcpNmr Analysis. in Acta crystallographica. Section D, Biological crystallography

Stevens TJ (2011) A software framework for analysing solid-state MAS NMR data. in Journal of biomolecular NMR

 
Description: CCPN, an ongoing collaborative computing project, works on software for macromolecular NMR spectroscopy and structure determination. This is a computer-intensive field, involving large amounts of data. Each study is analysed through several independent steps of automatic and interactive programs, and detailed results are deposited in public databases. It has been difficult for programs to exchange data, and more generally to maintain consistency and control over the large amounts of data. CCPN produces and maintains:
- The CCPN data exchange standard for NMR and macromolecular structure, which allows data to be exchanged, organised, and stored; extensive subroutine libraries to support programs using the data standard; and the MEMOPS code generation framework to generate and maintain synchronized subroutine libraries in different computing languages.
- The CcpNmr suite of programs for visualization and analysis of macromolecular NMR spectra, distributed for Windows, Linux and Mac computers, with facilities for a broad range of tasks, including conversion of data from external formats.
- Collaboration on integrating third-party NMR software, so that programs can be run easily from a single starting point and can make use of results from each other.
During the relevant grant period, CCPN has released the CcpNmr suite. The largest program is CcpNmr Analysis, for display, interactive assignment and analysis of NMR spectra. Another important program is CcpNmr FormatConverter, which uses the CCPN data standard as a stepping stone to convert to and from over 30 data formats for NMR and structural biology. The CcpNmr suite was released for Linux, Windows, and Mac OS X. The data standard was extended with supporting subroutines in Java, backed by either XML files or relational databases for storage. Residue templates for all monomers known to the PDB were prepared and released, with a system to keep the set updated in the future.
CcpNmr FormatConverter played a key role in the DOCR/FRED data reclamation project, which extracted structures and NMR restraints deposited to the PDB and BMRB and made them available in usable form.
CCPN has collaborated with Bruker Spectrospin on CCPN data export directly from the Bruker TOPSPIN program. We have collaborated with the Nilges and Vuister groups to set up integration with the ARIA structure determination program and the QUEEN validation suite, so that data can be selected for calculation from CcpNmr Analysis. We have spent a great deal of effort collaborating with the PIMS project on a laboratory information management system (LIMS). The collaboration included changing the CCPN data model and code generation framework to be more PIMS-friendly, and jointly expanding the data standard to cover the requirements of a LIMS, such as sample production and tracking, biochemical workflow, and target selection. We have further set up the CCPNGRID web server, allowing external groups to run ARIA and QUEEN calculations remotely.
CCPN has run a number of courses for programmers and CcpNmr users, and has arranged three UK NMR conferences specifically aimed at introducing young researchers to best practice in the field. Our user base includes four international companies as paying customers.
Exploitation Route: As of 2014 the CCPN project is continuing to support and develop the results obtained so far, financed by a mixture of research council funding, user fees, and paid cooperation agreements with both industrial (Novartis) and academic groups (Imperial College, London). The yearly UK NMR conference and the program of user courses and tutorials both continue to be supported, and the number of users keeps growing, lately with the addition of a number of research groups at Imperial College.
CCPN is currently engaged in a major rewrite of the entire CcpNmr Analysis software suite, version 3, with the following aims:
- To upgrade to a modern graphics package that allows for a more powerful user interface.
- To simplify working with the program for all but the most complex problems, making it more attractive and easier to learn and use.
- To make the program easier to extend, also for users without a computing background.
- To simplify future maintenance by increasing modularity and simplicity, and by making greater use of standard software libraries.
CCPN is part of a number of collaborations that aim to bring in external groups to develop and maintain modules within the CcpNmr version 3 framework, in fields like solid state NMR analysis. We also have a central role in the efforts to standardize the deposition of NMR data, and are coordinating the development of an agreed data storage format for NMR-related software packages.
Sectors: Chemicals, Healthcare, Pharmaceuticals and Medical Biotechnology

URL: http://www.ccpn.ac.uk/ccpn
 
Description: As an ongoing collaborative computing project, CCPN serves as a promoter of an NMR community with a common computing environment. CCPN maintains the data exchange standard for the field, has a central role in collaborations to deliver software integration and pipelines, and maintains the only fully capable long-lived NMR analysis platform. These efforts feed into making NMR analysis tools more capable, easier and faster to use, increasing the productivity of the scientists that use them. The impact of CCPN follows from this increased productivity. One measure of impact is simply the take-up of the CcpNmr software suite. As of the end of 2011 the major program, CcpNmr Analysis, had about 1000 users at over 150 sites worldwide, with CcpNmr FormatConverter used even more widely. The CcpNmr suite is emerging as the dominant program for solid state NMR analysis, and has recently been extended to metabolomics. CCPN software and data standards play a key role in the deposition of data at the wwPDB and BioMagResBank, and in the remediation and re-use of already deposited data. The project has continued since the end of this grant, and as of the end of 2015 we estimate that CcpNmr Analysis has several thousand users worldwide. CCPN is a key member of WeNMR, the European collaboration providing Grid computing and software pipelines for NMR. The impact is not limited to academia. Results deposited in public databases are also a crucial resource for companies in the field of biochemistry. As of the end of 2011 nine companies, mainly large multinationals, were paying to use the CcpNmr software suite for pharmaceutical or agrochemical research. Many of these are actively collaborating in the development of the new CcpNmr program for drug candidate screening by NMR. CCPN plays an important role in training future researchers in the field of NMR, both through its conferences, designed to introduce young researchers to best practice in the NMR field, and through training courses run by CCPN.
One private company, SpronkNMR of Vilnius, Lithuania, is running courses in the use of CcpNmr software on a commercial basis.
First Year Of Impact: 2013
Sector: Chemicals, Healthcare, Pharmaceuticals and Medical Biotechnology
Impact Types: Economic