The Collaborative Computational Project for NMR (CCPN): Supporting biomolecular NMR and community driven NMR software development.

Lead Research Organisation: University of Leicester
Department Name: Biochemistry

Abstract

The Collaborative Computing Project for NMR (CCPN) was started in 2000 to improve the interoperability of software for biomolecular NMR and to promote a collaborative community of software users and programmers. Over the past twelve years, the project has produced the CcpNmr suite of software for interactive NMR data analysis and integration and the CCPN data standard for macromolecular NMR. The software is now used worldwide by >1000 users. Through its conferences and workshops, CCPN also promotes best practices in both computational and experimental aspects of NMR.
With this proposal we seek to continue and further expand the CCPN project and its user community. Building upon the CCPN software platform, new and emerging aspects of NMR and integrative structural biology techniques will now be targeted. For the current period, we aim to:
1. Develop a fully implemented version 3 (v3) of the CCPN software package.
2. Promote and expand the user uptake and user development of the software.
3. Facilitate the use and sharing of state-of-the-art NMR software technology.
4. Strengthen the training of people, sharing of knowledge and exchange of best practices by the UK NMR community.

Version 3 of the CcpNmr suite has led to new, user-friendly applications like SpecView and ChemBuild. These are the first CCPN programs to use an entirely new graphical system based on the modern, multi-platform Qt libraries. During the new period, we will bring the same approach to the Analysis and FormatExchange programs. We will ensure that v3 components are easy to use, fully documented and thoroughly tested, with dedicated modules for specialized tasks and means to interact with external programs and services. This will provide flexible tools to support scientific collaborations and advance important areas such as solid-state NMR, small-molecule screening and metabolomics.

To promote CCPN and increase user uptake, together with our partners we will embed CCPN in a broader range of scientific projects, enhancing its impact in the structural biology community. The greatly enhanced flexibility of the new code will promote dedicated program adaptation to the working procedures of individual laboratories. We plan to use the modular capabilities of CcpNmr v3 and a reorganised library of high-level utility functions to allow more groups to start writing software for their own specific practices. To ease user uptake, a dedicated Analysis Lite, optimized for simplicity rather than for complex problems, will be developed. The CCPN software will be extensively demonstrated through workshops and documentation.

Together with its partners, CCPN will continue the process of software development and integration by means of direct integration of available methods or by using web-based services developed by third parties. Thus, CCPN users will have easy access to a software pipeline for biological NMR, which allows them to proceed smoothly from spectral data via resonance assignment to structure generation and database deposition. CCPN will collaborate with CCISB groups at Harwell to allow NMR to fulfill its role in biophysics and integrated science.

To strengthen the UK NMR community, we will continue the successful series of UK CCPN conferences and teaching programs for (young) researchers. We will also continue the comprehensive help and support for CCPN users and participate in international efforts in knowledge sharing and exchange of best practices by efforts such as WeNMR. We will closely confer with the NMR facilities in Mill Hill, Birmingham and Warwick.

Oversight of the project will be maintained by the CCPN Executive Committee, with representative members from the UK NMR community and CCPN's international Scientific Advisory Board.

Technical Summary

The CCPN software is now well established and founded upon a stable UML-based data model. Custom Python scripts are used to generate software libraries (APIs) in Python, Java and C, using storage in XML for Python/C and SQL for Java. These APIs, with their built-in validity checking and backwards and forwards data compatibility, form the basis of the CcpNmr suite of software. The coming period will bring the development of CcpNmr version 3 (v3), including Analysis, Analysis Lite, FormatExchange and integration with 3rd party programs. The new version aims to provide an improved user experience, greater ease of use, customizability and specialization for particular application areas, in part through the use of the faster, more modern Qt graphics library. The v3 software will also provide for user-specified task-oriented layouts that allow customization of the graphical interface, e.g. for solid-state NMR, small molecules or metabolomics. Together with a new high-level subroutine library, this will allow speedy and user-focused 3rd party development of CCPN-enabled software.
We will work closely with our partners in developing software and tools for NMR and other biophysical techniques, to encourage further integration with CCPN software. The CCPN Data Model will be extended as required and we will train developers in the use of the CCPN software libraries. In collaborative scientific projects, we will participate in new avenues in the areas of data analysis, including structure calculations and the use of chemical shifts. We will further develop tools to facilitate access to remote services on the grid, both via web browsers and from inside Analysis.
For interacting with our user base, we will continue to provide the usual fast user support as well as improve our bug tracking and handling. We will continue to run courses for users, including (new) events organised jointly with WeNMR and regularly interact with relevant stakeholders.
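
As an illustration of the idea of model-derived APIs with built-in validity checking and XML storage mentioned in this summary, the following is a minimal Python sketch. It is not taken from the CCPN libraries: the class name Peak, its attributes, the accepted ppm range and the XML layout are all hypothetical and chosen only for illustration.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class Peak:
    """Hypothetical model object; NOT a class from the CCPN API."""
    serial: int
    position_ppm: float
    height: float

    def __post_init__(self):
        # Validity checking runs whenever an object is created.
        self.validate()

    def validate(self):
        if self.serial < 1:
            raise ValueError("serial must be a positive integer")
        # Assumed plausibility window, chosen only for this sketch.
        if not -50.0 <= self.position_ppm <= 500.0:
            raise ValueError("position_ppm outside the assumed range")

    def to_xml(self):
        # Store every attribute as an XML attribute string.
        return ET.Element(
            "Peak",
            serial=str(self.serial),
            position_ppm=repr(self.position_ppm),
            height=repr(self.height),
        )

    @classmethod
    def from_xml(cls, element):
        # Loading goes through the constructor, so stored data is re-validated.
        return cls(
            serial=int(element.get("serial")),
            position_ppm=float(element.get("position_ppm")),
            height=float(element.get("height")),
        )


peak = Peak(serial=1, position_ppm=8.25, height=1.5e6)
xml_text = ET.tostring(peak.to_xml(), encoding="unicode")
restored = Peak.from_xml(ET.fromstring(xml_text))
```

Because loading goes through the same constructor as creation, invalid stored data is rejected at read time as well as at write time, which is the general benefit of placing the checks in the API layer rather than in each application.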

Planned Impact

The CCPN project is a long-standing, highly-regarded, advanced technological software development and community-building project. Its continued development and maintenance is crucial not only for the field of NMR spectroscopy itself, but also for the larger scientific and societal context. The impact of CCPN is increasingly important as CCPN represents the only fully capable long-lived NMR data-standard and analysis platform. CCPN serves as an example of the acknowledged imperative for the UK to be a leader in the global knowledge economy.

The impact extends to a number of areas:
1. Supporting academic research excellence. The CCPN software package is essential for the research of a large number of UK and worldwide NMR scientists active in the biomedical sciences, including many MRC-funded PIs. Continuing support, improvement and expansion of the software with the latest state-of-the-art tools is essential, as the improved tools developed by CCPN and its partners drive their biomedical NMR-based research programs and stimulate developments in grid-technology, E-science and -omics type research.

2. Support for industrial innovation and research capacity. Both drug discovery and metabolomics require the development of dedicated software tools. By developing an open-source, readily customizable software platform, CCPN supports industrial innovation, in particular for small- and medium-sized companies that alone cannot afford the development of the technologically demanding NMR technique. The continuing industrial support of CCPN underscores this notion.

3. Providing a scientifically well-trained professional workforce. The presence of a well-educated and technologically skilled workforce is crucial for societal development. CCPN provides direct training for graduate students and post-docs who will soon contribute to the core of the UK technological workforce. By interacting with CCPN at conferences and workshops and by using its technological base, they are trained in scientific thinking, problem solving and scientific software development. These skills are useful well beyond the specific problems to which they were first applied.

4. Driving technological developments that benefit the UK population through improved health-care. NMR has emerged as a crucial tool in the quest for dedicated, personalized medicine approaches. Not only does NMR contribute to our understanding of structure and interactions of biomolecules in a variety of states (folded, aggregated, unfolded), NMR-assisted drug discovery has also proved to be one of the leading components in the search for new therapeutic agents, not least in methods for fragment-based lead compound identification. Metabolomics presents a second area of technological development of great potential benefit to the general UK public. Accurate, detailed and timely detection of disease is nearly always beneficial for effective treatment and improves the chances of recovery. NMR has emerged as an excellent tool for the analysis of body fluids and for attempts to elucidate relevant biomarkers. Specific metabolite patterns are starting to be recognised and correlated with specific disease states. Thus, NMR offers great potential as a diagnostic tool that will benefit society at large.

5. Education of the general public. Current high-school students represent the future generation of potential scientists. The NMR technique is part of the GCE A-level chemistry curriculum. The CCPN software tools, especially ChemBuild and SpecView, provide an excellent way to familiarise the students with the curriculum objectives. By providing actual NMR data together with suitable tools and hands-on approaches, we will stir the students' imagination.

 
Description BIS NMR
Geographic Reach National 
Policy Influence Type Participation in a national consultation
URL https://bisgovuk.citizenspace.com/digital/consultation-on-proposals-for-long-term-capital-in
 
Description BIS consultation UKSB
Geographic Reach National 
Policy Influence Type Participation in a national consultation
URL https://bisgovuk.citizenspace.com/digital/consultation-on-proposals-for-long-term-capital-in
 
Description Participation in RCUK business case document for NMR
Geographic Reach National 
Policy Influence Type Participation in a national consultation
Impact Information provided for RCUK to formulate the business case related to the long-term NMR investment. NMR is a crucial technique for both fundamental and applied research in biomolecular sciences, health, biotechnology and material sciences.
 
Description CCPN CYANA integration 
Organisation ETH Zurich
Country Switzerland 
Sector Academic/University 
PI Contribution Jointly developed the CCPN CYANA importer. CCPN implemented the required software on the basis of specification of needs of the partner.
Collaborator Contribution Meier's group provided specification, test data and did the testing.
Impact Incorporation of the software tool in the CcpNmr Analysis 2.4 release
Start Year 2013
 
Description CCPN Parassign integration 
Organisation Leiden University
Country Netherlands 
Sector Academic/University 
PI Contribution CCPN prepared the platform in Analysis-V3 for integration of the PARassign program
Collaborator Contribution Partner Leiden implemented the PARassign interface together with CCPN. Partner Leiden sent its developer to Leicester for face-to-face meetings and joint coding sessions.
Impact PARassign plugin distributed with Analysis-V3
Start Year 2013
 
Description CCPN side-chain assignment module 
Organisation Imperial College London
Department Imperial College Trust
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution CCPN implemented the Imperial strategy for side-chain assignments as part of the CcpNmr Analysis package.
Collaborator Contribution Partner provided the expertise, the testing and the allocation of resources to complete the project.
Impact The tool is now part of the CcpNmr Analysis release 2.4 and thus available for the larger scientific community.
Start Year 2010
 
Description CCPN-Farseer integration 
Organisation University of Barcelona
Country Spain 
Sector Academic/University 
PI Contribution The tremendous improvement in efficiency obtained with the method developed by the Farseer team prompted our partner and us to share the tool with the broad NMR community. To that end, we are now implementing Farseer in the CCPNMRv3 project. We (i.e. CCPN) have already developed a basic, yet functional plugin allowing the Farseer team to further customise their tool within the CcpNmr Analysis version-3 framework.
Collaborator Contribution The analysis of NMR spectra ultimately concludes with the generation of peaklists, which are simply text files on the computer's hard disk; peaklists contain all the information encoded in the NMR spectra (peak assignments, peak position, height, volume, etc.). Scientific projects that rely on NMR to solve biologically relevant problems mainly require the comparison of peaklists from different samples and experimental conditions, which translates into restraint calculation and final data plotting; for example, in protein-protein or protein-ligand interactions, domain dynamics, or the use of paramagnetic NMR restraints. Before a peaklist can be used reliably, it needs to be treated according to the project's specifications. The treatment of peaklist files is traditionally carried out manually by the researcher in spreadsheet documents. Handling peaklists manually is very time-consuming and, above all, prone to human error. Moreover, obtaining a reliable peaklist often requires an interactive process of analysis and correction, which increases task repetition and time consumption. Additionally, when several projects run in parallel in a laboratory, there is a strong demand for fast, yet reliable, NMR peaklist analysis, data treatment and plotting, so that experimental conclusions can be drawn quickly. The necessity to cope with such issues and demands led us to develop a Python3-based tool that instantly treats peaklists, calculates all the required parameters/restraints and plots the obtained data into analysable results. Our aim is to automate the heavy, repetitive and error-prone tasks of manually treating, analysing and plotting NMR peaklist-derived data, thus reducing the time consumed from hours/days/weeks to a few minutes and avoiding the introduction of mistakes. By now, the most common NMR-related tasks in peaklist treatment, parameter calculation and plotting have been implemented, specifically the analysis of titrations in protein-protein/ligand interactions and the representation and analysis of paramagnetic relaxation enhancement (PRE) data and chemical shift perturbation data. We have also prepared a set of plotting templates that can represent the obtained data with clarity and with publication quality; indeed, our most recent publications already present plots generated this way [1,2,3]: from peaklists to publication in one click! To us, the most outstanding characteristic of Farseer is that it allows the analysis and representation of titration/comparison data along up to three experimental conditions, which means that a cubic matrix of, for example, 60 related peaklists can be analyzed and correlated simultaneously, generating large amounts of data automatically.
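
To make the kind of peaklist treatment described above concrete, here is a minimal Python sketch that compares two peaklists and computes a combined 1H/15N chemical shift perturbation per residue. It is not Farseer code. The CSV column names (residue, h_ppm, n_ppm), the function names and the 15N scaling factor of 0.14 are assumptions made for this example, and the formula is one commonly used form of the combined perturbation; the record does not state which weighting Farseer applies.

```python
import csv
import io
import math

ALPHA_N = 0.14  # assumed 15N scaling factor; other values are in common use


def read_peaklist(text):
    """Parse CSV text with columns residue,h_ppm,n_ppm into {residue: (h, n)}."""
    reader = csv.DictReader(io.StringIO(text))
    return {
        int(row["residue"]): (float(row["h_ppm"]), float(row["n_ppm"]))
        for row in reader
    }


def combined_csp(reference, titrated, alpha=ALPHA_N):
    """Combined 1H/15N perturbation for residues present in both peaklists."""
    csp = {}
    for residue in sorted(reference.keys() & titrated.keys()):
        h0, n0 = reference[residue]
        h1, n1 = titrated[residue]
        d_h = h1 - h0
        d_n = alpha * (n1 - n0)
        csp[residue] = math.sqrt(0.5 * (d_h ** 2 + d_n ** 2))
    return csp


# Illustrative data only; these shifts are made up for the example.
reference = read_peaklist("residue,h_ppm,n_ppm\n10,8.20,120.0\n11,7.95,118.5\n")
titrated = read_peaklist("residue,h_ppm,n_ppm\n10,8.30,120.0\n11,7.95,119.5\n")
for residue, value in combined_csp(reference, titrated).items():
    print(residue, round(value, 3))
```

Residues missing from either list are skipped rather than reported as zero, so unassigned or lost peaks are not mistaken for unperturbed ones.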
Impact The collaboration has served as the nucleus for further engagement, e.g. as an iNext proposal.
Start Year 2016
 
Description CcpNmr-Almost software integration 
Organisation Institute for Research in Biomedicine (IRB)
Department Computational Structural Biology
Country Switzerland 
Sector Charity/Non Profit 
PI Contribution Agreement to integrate the 'Almost' suite for molecular dynamics calculations and NMR structure determination with the CcpNmr analysis and visualization suite. Our team to write code for selecting data within the CcpNmr suite, launching Almost, and re-integrating the results in the CcpNmr suite and its data model, for one key calculation protocol.
Collaborator Contribution Collaborating team to provide expertise, test data, and testing for the Almost program, and to develop interfaces for multiple calculation protocols, once the first functioning protocol has been set up. Future program releases to be coordinated between the two groups.
Impact Collaboration agreement signed, including roadmap for software development and future releases.
Start Year 2013
 
Description NMRbox consultation 
Organisation University of Connecticut
Country United States 
Sector Academic/University 
PI Contribution NMRbox is a resource for biomolecular NMR (Nuclear Magnetic Resonance) software. It provides tools for finding the software you need, documentation and tutorials for getting the most out of the software, and cloud-based virtual machines for executing the software.
Collaborator Contribution We collaborate on all issues regarding software development, outreach and technology sharing.
Impact Partner in CCPN continuation grant application
Start Year 2019
 
Title CcpNmr AnalysisAssign version 3 alpha release 
Description The CcpNmr Analysis package is a program for interactive analysis, data tracking and management of macromolecular NMR data, and for integration with other programs in the field, such as data acquisition and structure generation engines.
Type Of Technology Software 
Year Produced 2015 
Open Source License? Yes  
Impact Seven alpha testers tested the program and produced detailed feedback with likes, dislikes and suggestions for further development. 
 
Title CcpNmr AnalysisAssign version 3 beta1 release 
Description We released CcpNmr AnalysisAssign version-3, the latest software release from the Collaborative Computational Project for NMR, for all aspects of NMR data analysis, including liquid- and solid-state NMR data. We include a workflow for backbone assignment as an example of the flexibility and simplicity of implementing workflows, as well as the toolkit used to create the necessary graphics for this workflow. The package can be downloaded from www.ccpn.ac.uk/v3-software/downloads and is freely available to all non-profit organisations.
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact This software has been designed to be simple, functional and flexible, and aims to ensure that routine tasks can be performed in a straightforward manner. We have designed the software according to modern software engineering principles and leveraged the capabilities of modern graphics libraries to simplify a variety of data analysis tasks. 
URL http://www.ccpn.ac.uk/v3-software
 
Title CcpNmr AnalysisAssign, releases 3.0-beta5 and 3.0.0 
Description Premier software package with unique capabilities for liquid-state and solid-state NMR data analysis. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact All CcpNmr packages jointly cited >2100 times. Used by four industrial subscribers, with two pending. 
URL https://www.ccpn.ac.uk
 
Title CcpNmr AnalysisMetabolomics, release 3.0-beta1 
Description Premier software package with unique capabilities for NMR-based metabolomics data analysis. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact All CcpNmr packages jointly cited >2100 times. Used by four industrial subscribers, with two pending.
URL https://www.ccpn.ac.uk/
 
Title CcpNmr AnalysisScreen, release 3.0-beta2 
Description Premier software package with unique capabilities for NMR-based screening data analysis. 
Type Of Technology Software 
Year Produced 2019 
Open Source License? Yes  
Impact All CcpNmr packages jointly cited >2100 times. Used by four industrial subscribers, with two pending. 
URL https://www.ccpn.ac.uk
 
Title CcpNmr ChemBuild version 1.0 
Description ChemBuild is a program to generate, modify and convert molecular topology descriptions used for (bio)molecular simulations. It allows the users to adjust the templates in an interactive fashion. 
Type Of Technology Software 
Year Produced 2016 
Open Source License? Yes  
Impact Enhances the UK's international standing in research and development. 
URL http://www.ccpn.ac.uk/v3-software
 
Title CcpNmr Analysis release 2.4
Description CcpNmr Analysis version 2.4 with enhanced assignment tools, new restraint calibration tool, new summary tool and new CYANA integration tool. 
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact Positive user feedback. 
URL http://www.ccpn.ac.uk/software/analysis
 
Title The NMR Exchange Format (NEF) 
Description The NMR Exchange Format (NEF) has been developed in a collaboration between the CCPN, the BioMagResBank, the RCSB, and the main developers of macromolecular NMR software (Peter Guntert (CYANA), Charles Schwieters (XPLOR-NIH), Michael Nilges (ARIA), Torsten Herrmann (UNIO), David Wishart, David Case (AMBER), Guy Montelione (AutoAssign, ASDP)). It covers sequence, chemical shifts, spectra, peak lists, and restraints. The format specification is controlled by consensus of the partners, and all developers have committed to supporting the format as an input/output exchange format. Version 1.0 of the format specification is now stable and fully supported by CCPN, and will be supported by the upcoming release of NMR-STAR (version 3.2.0.1). 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact For over 20 years, efforts to establish seamless NMR data exchange between different programs have failed, relying on conversion between a variety of formats instead with a concomitant risk of information loss or misinterpretation. Efforts to develop universal NMR data converters have been challenged because some formats omit information required by other formats, and full parsing of each software-specific format has proven to be impossible. The current situation hampers the proper archiving and use of biomolecular NMR data, and prevents the routine inclusion of NMR restraint validation in the wwPDB NMR validation pipeline. The new NMR exchange format was developed in close consultation and with support of developers of key software packages used for NMR structure determination and refinement, with the aim of attaining a unified approach to represent NMR restraints and associated data. Together, they agreed on and successfully implemented and tested an NMR data representation and devised a governance structure for its maintenance and further development. The authors of fourteen different packages already committed during the initial discussions, with new ones joining the efforts since. 
URL https://github.com/NMRExchangeFormat/NEF/
 
Description Open-Science day 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact For over three hours, our scientists welcomed students from places such as Groby, Leicester, Northampton, Corby, Wellingborough, Rothwell and Coalville to the Henry Wellcome Building. To showcase the work carried out in the Department there were a series of short talks delivered by leading researchers, hands-on activities, displays, competitions and tours of our research labs.
Over a hundred students from nine schools had the chance to hear about and see our research and talk to our scientists, while having some fun and enjoying the refreshments.

The team showcased the CCPN program and provided live demonstrations at the NMR spectrometer.
Year(s) Of Engagement Activity 2016
URL http://www2.le.ac.uk/departments/molcellbiol/file-store/science-open-day-2016
 
Description Opinion on Brexit for The Guardian 
Form Of Engagement Activity A press release, press conference or response to a media enquiry/interview
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact I formulated my opinion on the impact of a Brexit, with particular emphasis on the need for knowledge and infrastructure sharing, collaboration and joint technology development. These are all items within the remit of CCPN.
Year(s) Of Engagement Activity 2016
URL http://www.theguardian.com/politics/ng-interactive/2015/nov/11/brexit-would-weaken-uk-universities