CCPN - A collaborative computational project for macromolecular NMR spectroscopy

Lead Research Organisation: University of Cambridge
Department Name: Biochemistry

Abstract

CCPN is a Collaborative Computing Project for the study of biological macromolecules and their structure by NMR spectroscopy. CCPN works to promote collaboration between NMR software developers and to make better software available to the NMR user community. Our goal is to ensure that the different NMR programs can work together and use each other's results in a seamless fashion. Secondly, we want to make sure that all the results can easily be collated and made available to others. We also arrange meetings and workshops to define, and spread knowledge about, the best ways of working. Since CCPN started in 2000, we have defined a standard way of describing scientific data in our particular area. To make it easier to write programs that read and write data in this standard form, we have developed large program libraries for this purpose. We have also written a new program for the analysis of NMR spectra, and we have worked with other groups to adapt their programs to use data defined in this standard way. It is now possible to determine the structure of a protein by NMR spectroscopy, using only programs that use data in this standard form. We have additionally written programs to make it easier to write and maintain the data standard and its program libraries, and we have arranged workshops and annual conferences. In this new application we wish to expand the work we have begun to cover a wider area and to make it useful for more people. The data standard needs to have more program libraries, so that it can also be used by people who want to work in programming languages like C, C++ or Perl. It also has to work better with databases and with very large projects that generate lots of data every day. The data standard should be expanded to cover newer NMR methods, NMR data-processing, structure determination by X-ray crystallography, and other ways of analysing proteins.
Our NMR analysis program should be expanded to cover the additional NMR disciplines, more tasks and different ways of working. We will use the results to work with other groups, making their programs work with the data standard. We have contacted many groups that write programs for NMR spectroscopy, and we hope that eventually many of the programs that people use in NMR spectroscopy will work together using the CCPN data standard. We are also talking to groups that work in related areas of biophysics - bioinformatics, protein target selection, large-scale production of proteins, biophysical analysis of proteins, large-scale structure determination, metabolite analysis, drug candidate screening, and analysis of the huge amounts of data all this generates - and we work to make sure that the programs that are used in NMR spectroscopy and in related areas of biophysics can all talk to each other.

Technical Summary

CCPN is a Collaborative Computing Project for the macromolecular NMR community. Its purpose is to create data standards, promote software integration, produce and disseminate software, and organise conferences and workshops. Over the period of this grant, CCPN intends to: 1) Extend and improve the core software platform, 2) Broaden the macromolecular NMR software suite by expanding our NMR analysis program and collaborating with other groups to integrate programs from European NMR software developers, 3) Integrate the NMR software with related biology efforts, and 4) Continue to organise workshops and study weekends for the UK NMR community. We will extend our NMR analysis program into the areas of (semi)automatic assignment, ligand screening, dynamics and conformers/excited states of proteins, and solid state NMR. The CCPN integrated software also needs to encompass more of the popular NMR software packages as well as tools being developed through the EU projects EXTEND-NMR and EU-NMR. In addition, biologists would benefit greatly from having the same degree of software integration extended to other areas such as bioinformatics, protein production, biophysical characterisation, X-ray crystallography and metabolomics. The data modelling work and its associated code generation machinery represent a considerable investment that can be used to cover a much larger scientific area. We propose to continue to improve the core software platform, to collaborate on the development of Laboratory Information Management Systems (LIMS), and to extend our data model to support the new efforts. We will continue to organise meetings. Our popular yearly meeting for the UK NMR community serves to discuss and disseminate best practice for the determination of macromolecular structures by NMR. We shall also continue our series of workshops discussing software integration, standards and new techniques, and organise courses on the use of our NMR and data modelling software.

Publications

Fogh RH (2006) A nomenclature and data model to describe NMR experiments. in Journal of biomolecular NMR

Fogh RH (2010) MEMOPS: data modelling and automatic code generation. in Journal of integrative bioinformatics

Gutmanas A (2015) NMR Exchange Format: a unified and open standard for representation of NMR restraint data. in Nature structural & molecular biology

Penkett CJ (2010) Straightforward and complete deposition of NMR data to the PDBe. in Journal of biomolecular NMR

Ragan TJ (2015) Analysis of the structural quality of the CASD-NMR 2013 entries. in Journal of biomolecular NMR

Skinner SP (2015) Structure calculation, refinement and validation using CcpNmr Analysis. in Acta crystallographica. Section D, Biological crystallography

Stevens TJ (2011) A software framework for analysing solid-state MAS NMR data. in Journal of biomolecular NMR

 
Description CCPN, an ongoing collaborative computing project, works on software for macromolecular NMR spectroscopy and structure determination. This is a computer-intensive field, involving large amounts of data. Each study is analysed through several independent steps using automatic and interactive programs, and detailed results are deposited in public databases. It has been difficult for programs to exchange data and, more generally, to maintain consistency and control over such large amounts of data. CCPN produces and maintains:
- The CCPN data exchange standard for NMR and macromolecular structure, which allows data to be exchanged, organised, and stored; extensive subroutine libraries to support programs using the data standard; the MEMOPS code generation framework to generate and maintain synchronized subroutine libraries in different computing languages.
- The CcpNmr suite of programs for visualization and analysis of macromolecular NMR spectra, distributed for Windows, Linux and Mac computers, with facilities for a broad range of tasks, and converting data from external formats.
- Collaboration on integrating third-party NMR software, so that programs can be run easily from a single starting point and can make use of results from each other.
During the relevant granting period, CCPN has released the CcpNmr suite. The largest program is CcpNmr Analysis, for display, interactive assignment and analysis of NMR spectra. Another important program is CcpNmr FormatConverter, which uses the CCPN data standard as a stepping stone to convert to and from over 30 data formats for NMR and structural biology. The CcpNmr suite was released for Linux, Windows, and Mac OS X. The data standard was extended with supporting subroutines in Java, backed by either XML files or relational databases for storage. Residue templates for all monomers known to the PDB were prepared and released, with a system to keep the set updated in the future.
CcpNmr FormatConverter played a key role in the DOCR/FRED data reclamation project, which extracted structures and NMR restraints deposited to the PDB and BMRB and made them available in usable form.
CCPN has collaborated with Bruker Spectrospin on CCPN data export directly from the Bruker TOPSPIN program. We have collaborated with the Nilges and Vuister groups to set up integration with the ARIA structure determination program and the QUEEN validation suite, so that data can be selected for calculation from CcpNmr Analysis. We have spent a great deal of effort collaborating with the PIMS project on a laboratory information management system. The collaboration included changing the CCPN data model and code generation framework to be more PIMS-friendly, and jointly expanding the data standard to cover the requirements of a LIMS, such as sample production and tracking, biochemical workflow, and target selection. We have further set up the CCPNGRID web server, allowing external groups to run ARIA and QUEEN calculations remotely.
CCPN has run a number of courses for programmers and CcpNmr users, and has arranged three UK NMR conferences specifically aimed at introducing young researchers to best practice in the field. Our user base includes four international companies as paying customers.
Exploitation Route As of 2014 the CCPN project is continuing to support and develop the results obtained so far, financed by a mixture of research council funding, user fees, and paid cooperation agreements with both industrial (Novartis) and academic groups (Imperial College, London). The yearly UK NMR conference and the program of user courses and tutorials both continue to be supported, and the number of users keeps growing, lately with the addition of a number of research groups at Imperial College.
CCPN is currently engaged in a major rewrite of the entire CcpNmr Analysis software suite (version 3), with the following aims:
- To upgrade to a modern graphics package that allows for a more powerful user interface.
- To simplify working with the program for all but the most complex problems, making it more attractive and easier to learn and use.
- To make the program easier to extend, also for users without a computing background.
- To simplify future maintenance by increasing modularity and simplicity, and by making greater use of standard software libraries.
CCPN is part of a number of collaborations that aim to bring in external groups to develop and maintain modules within the CcpNmr version 3 framework, in fields like solid state NMR analysis. We also have a central role in the efforts to standardize the deposition of NMR data, and are coordinating the development of an agreed data storage format for NMR-related software packages.
Sectors Chemicals, Healthcare, Pharmaceuticals and Medical Biotechnology

URL http://www.ccpn.ac.uk/ccpn
 
Description As an ongoing collaborative computing project, CCPN serves as a promoter of an NMR community with a common computing environment. CCPN maintains the data exchange standard for the field, has a central role in collaborations to deliver software integration and pipelines, and maintains the only fully capable long-lived NMR analysis platform. These efforts feed into making NMR analysis tools more capable, easier and faster to use, increasing the productivity of the scientists that use them. The impact of CCPN follows from this increased productivity. One measure of impact is simply the take-up of the CcpNmr software suite. As of the end of 2011 the major program, CcpNmr Analysis, has about 1000 users at over 150 sites worldwide, with CcpNmr FormatConverter used even more widely. The CcpNmr suite is emerging as the dominant program for solid state NMR analysis, and has recently been extended to metabolomics. CCPN software and data standards play a key role in the deposition of data at the wwPDB and BioMagResBank, and in the remediation and re-use of already deposited data. CCPN is a key member of WeNMR, the European collaboration providing Grid computing and software pipelines for NMR. The impact is not limited to academia. Results deposited in public databases are also a crucial resource for companies in the field of biochemistry. As of the end of 2011 nine companies, mainly large multinationals, are paying to use the CcpNmr software suite for pharmaceutical or agrochemical research. Many of these are actively collaborating in the development of the new CcpNmr program for drug candidate screening by NMR. CCPN plays an important role in training future researchers in the field of NMR, both through its conferences, designed to introduce young researchers to best practice in the NMR field, and through training courses run by CCPN. One private company, SpronkNMR of Vilnius, Lithuania, is running courses in the use of CcpNmr software on a commercial basis.
First Year Of Impact 2011
Sector Chemicals, Healthcare, Pharmaceuticals and Medical Biotechnology
Impact Types Economic