FUNCLAN - FUNctional annotations through Conformational Landscape Analysis

Lead Research Organisation: Science and Technology Facilities Council
Department Name: Scientific Computing Department

Technical Summary

We will develop FUNCLAN, a framework for comparative analysis of conformations and their associated annotations. This will be achieved through major improvements to the superposition software GESAMT and will deliver a robust process for superposing and clustering macromolecular assemblies, protein chains and ligand-binding sites across the Protein Data Bank archive. We will perform a comprehensive analysis of the clustered molecular entities, link them to experimental validation information and map annotations of their biological and biophysical contexts onto them. We will use this enriched and integrated data to refine the superposition and clustering processes, provide a representative structure for each cluster and design metrics for evaluating clustered assemblies, protein chains or ligand-binding sites. The FUNCLAN framework will support superposition of macromolecular assemblies, where part of the challenge is that topology can change together with the conformation of individual components. The project will also tackle challenges unique to the superposition of ligand-binding sites, such as superposing amino acid residues that interact with the same small molecule versus superposing small molecules bound in different binding sites.
The project will provide:
1. A software suite and web server for analysing assemblies, protein chains and ligand-binding sites;
2. High-quality manually curated benchmarking datasets of conformational clusters and their biological and biophysical annotations;
3. A robust and iteratively improved pipeline for superposing macromolecular assemblies, protein chains and ligand-binding sites;
4. Data standards and evaluation metrics for superposed and clustered molecular entities;
5. Clustered molecular entities linked to their validation information and biological annotations, which will be made available programmatically via an API and displayed on the PDBe-KB entry page.
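
The clustering and representative-selection steps described above can be illustrated with a minimal sketch. It is not FUNCLAN or GESAMT code: it assumes the structures have already been superposed and reduced to equally sized lists of matched atom coordinates, and it swaps in generic single-linkage clustering with a medoid as the representative. The function names, the `example_models` data and the 0.5 Å cutoff are all hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple

Coords = Sequence[Tuple[float, float, float]]


def rmsd(a: Coords, b: Coords) -> float:
    """RMSD between two equally sized, already superposed coordinate sets."""
    if len(a) != len(b) or not a:
        raise ValueError("coordinate sets must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(a, b)
    )
    return math.sqrt(squared / len(a))


def cluster_by_rmsd(models: Dict[str, Coords], cutoff: float) -> List[List[str]]:
    """Single-linkage clustering: models joined by a chain of RMSDs <= cutoff share a cluster."""
    names = sorted(models)
    parent = {name: name for name in names}

    def find(name: str) -> str:
        # Union-find lookup with path halving.
        while parent[name] != name:
            parent[name] = parent[parent[name]]
            name = parent[name]
        return name

    for i, first in enumerate(names):
        for second in names[i + 1:]:
            if rmsd(models[first], models[second]) <= cutoff:
                parent[find(second)] = find(first)

    groups: Dict[str, List[str]] = {}
    for name in names:
        groups.setdefault(find(name), []).append(name)
    return sorted(groups.values())


def representative(cluster: List[str], models: Dict[str, Coords]) -> str:
    """Medoid: the member with the smallest summed RMSD to all cluster members."""
    return min(
        cluster,
        key=lambda name: (sum(rmsd(models[name], models[other]) for other in cluster), name),
    )


# Hypothetical three-atom "structures" in two conformational states.
example_models: Dict[str, Coords] = {
    "open_1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.0, 0.0)],
    "open_2": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.3, 0.0)],
    "open_3": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (2.0, 0.6, 0.0)],
    "closed_1": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0)],
    "closed_2": [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.2, 0.0)],
}

clusters = cluster_by_rmsd(example_models, cutoff=0.5)
representatives = [representative(cluster, example_models) for cluster in clusters]
```

A medoid is used here because it is always an actual member of the cluster, which suits the stated goal of providing a representative structure; how FUNCLAN itself selects representatives is not specified in the summary.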

Publications

Title FUNCLAN Module for alignment and superposition of macromolecular complexes 
Description The module aligns and superposes protein complexes in three dimensions, assuming medium to high homology between the compared structures. While this problem is solved for covalently linked macromolecular structures (polymeric chains), existing solutions cannot be applied to non-covalently bound complexes because there is no canonical ordering of chains in three dimensions. A novel algorithm has been developed to avoid the quadratic (N^2) complexity of a naive implementation, in which all chains are aligned against all chains across the compared complexes. The new algorithm has linear (N) complexity and is therefore suitable for mass screening of ca. 200,000 PDB entries. The algorithm is implemented in C++, optimised for maximum efficiency, and is currently being tested for sensitivity and selectivity of matches. The application will be used to select homologous macromolecular complexes from the PDB automatically for subsequent coordinate analysis, aimed at detecting and classifying conformational changes that occur on protein and ligand binding, as well as those arising from crystallisation in different symmetry groups.
Type Of Technology Webtool/Application 
Year Produced 2023 
Open Source License? Yes  
Impact The software is being put into use; no reportable impact has been generated to date
URL https://gitlab.com/CCP4/gesamt/-/tree/main/dcg-project/FunCLAN
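
The cost of the naive all-against-all chain comparison mentioned in the description can be made concrete with a small sketch. The description does not say how the FUNCLAN algorithm reaches linear complexity, so the following is not that algorithm: it only contrasts the naive candidate enumeration with one generic way of reducing it, hash-based bucketing, under the assumption that each chain carries a hypothetical homology-family label. All names and data are illustrative.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Hypothetical representation: chain ID -> homology-family label.
Complex = Dict[str, str]


def naive_candidate_pairs(a: Complex, b: Complex) -> List[Tuple[str, str]]:
    """All-against-all: every chain of `a` is paired with every chain of `b`."""
    return [(chain_a, chain_b) for chain_a in a for chain_b in b]


def bucketed_candidate_pairs(a: Complex, b: Complex) -> List[Tuple[str, str]]:
    """Pair only chains that share a family label, found by hash lookup."""
    by_family: Dict[str, List[str]] = defaultdict(list)
    for chain_b, family in b.items():
        by_family[family].append(chain_b)
    return [
        (chain_a, chain_b)
        for chain_a, family in a.items()
        for chain_b in by_family.get(family, [])
    ]


# Two hypothetical heterotrimers whose chains are listed in different orders.
complex_a: Complex = {"A": "kinase", "B": "cyclin", "C": "inhibitor"}
complex_b: Complex = {"X": "cyclin", "Y": "kinase", "Z": "inhibitor"}

naive_pairs = naive_candidate_pairs(complex_a, complex_b)
bucketed_pairs = bucketed_candidate_pairs(complex_a, complex_b)
```

With N chains per complex, the naive enumeration always yields N × N candidate alignments. Bucketing yields about N when the family labels are mostly distinct, but it degrades back towards N × N for homo-oligomers, where many chains share one label.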