Gene-expression connectivity mapping and its application in phenotypic targeting

Lead Research Organisation: Queen's University Belfast
Department Name: Centre for Cancer Res and Cell Biology

Abstract

When a small-molecule compound is applied to a biological system (cells, animal, or human), the system interacts with (or responds to) the chemical; as a result, a large number of genes in the system change their expression in a particular pattern. Modern biotechnologies allow researchers to measure those changes in gene expression using 'microarray chips', generating gene-expression profiles. The proposed research involves the development and application of methods to connect different biological conditions using those gene-expression profiles, based on a novel concept called the 'connectivity map'. The connectivity map concept was introduced in 2006 in a paper published in Science and rests on two ideas: 1) different biological states have their own characteristic gene-expression profiles, and 2) connections between different biological states can be established based on their gene-expression similarity or dissimilarity. This is a simple yet very powerful idea. Two immediate applications of the connectivity mapping concept are: 1) if two chemical compounds are found to induce similar gene-expression profiles, the knowledge about one compound may help to predict the properties (pharmacological as well as toxicological) of the other; 2) if a compound is found to induce a gene-expression profile opposite to that of a disease state (whether animal disease or human disease), the compound may then be flagged as a potential drug to treat the disease. In the past few years, studies on connectivity mapping, including ours, have demonstrated the success and promise of this approach, for example in identifying pro-oestrogen (agonist) and anti-oestrogen (antagonist) chemicals, immunosuppressive drugs, and hair-growth agents. We will further develop the methods and techniques to map gene-drug-disease connections effectively and accurately. The results from this project will benefit biomedical researchers in their search for small-molecule therapeutics for diseases.
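The idea of scoring similarity or dissimilarity between profiles can be illustrated with a minimal sketch. It assumes a reference profile stored as signed ranks and a query signature stored as up/down signs; the function name `connection_score`, the toy gene names, and the normalisation are illustrative assumptions, not the project's actual implementation.

```python
# Illustrative sketch only: names, data and scoring rule are assumptions.

def connection_score(reference_profile, signature):
    """Score how well a gene signature matches a compound's reference profile.

    reference_profile: dict gene -> signed rank (large positive = strongly
        up-regulated by the compound, large negative = strongly down-regulated).
    signature: dict gene -> +1 (up in the query state) or -1 (down).
    Returns a value in [-1, 1]: positive means similar, negative means opposite.
    """
    n = len(reference_profile)
    m = len(signature)
    raw = sum(reference_profile.get(g, 0) * s for g, s in signature.items())
    # Largest achievable raw score: the m signature genes hold the top m ranks.
    max_raw = sum(n - i for i in range(m))
    return raw / max_raw if max_raw else 0.0


# Toy reference profile for one hypothetical compound (6 genes, signed ranks).
compound = {"G1": 6, "G2": 5, "G3": 4, "G4": -3, "G5": -2, "G6": 1}
similar = {"G1": +1, "G2": +1}   # query state raises G1 and G2
disease = {"G1": -1, "G2": -1}   # query state lowers G1 and G2

print(connection_score(compound, similar))   # 1.0  (similar profile)
print(connection_score(compound, disease))   # -1.0 (opposite profile)
```

In this toy case the compound would be flagged as a candidate against the hypothetical disease state because its score is strongly negative.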

Technical Summary

The project covers three aspects of research in connectivity mapping: 1) methodological and algorithmic advancement, 2) novel applications, and 3) software development. We will investigate and develop a gene progression method for the construction of gene signatures, first by ranking and sorting genes based on their statistical and biological significance. By progressively incorporating more genes, the length of a gene signature is optimised by observing and achieving the desired statistical stringency among the discovered connections. We will develop a gene-signature perturbation method to test the stability of discovered connections. Two approaches to gene-signature perturbation will be investigated and their performance compared. The first is a leave-one-gene-out approach that derives shorter signatures. The second is an add-one-gene-in approach that adds an extra randomly selected gene to the signature. The perturbation stability will be assessed by the proportion of perturbed signatures for which the connection remains significant. This will help further analyse and narrow down these connections in an unbiased manner such that the success rate of any follow-up investigation can be maximised. We will develop two types of applications following 'the general principle of phenotypic targeting'. Using experimentally derived gene signatures, we will investigate cancer-related phenotypes in two case studies. Using manually coded gene signatures, we will apply connectivity mapping to the search for agents that down-regulate a small number of selected genes. We will use RAN and c-FLIP down-regulation as case studies. Finally, there is a software development component, in which we will implement a parallel computing model of connectivity mapping with the new methods and algorithms developed here. We will use C/C++ and OpenMP as a development platform to create tools that fully exploit the hardware capacities of the increasingly common multi-core desktop computers.
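The two perturbation approaches and the stability measure can be sketched as follows. This is a minimal illustration under stated assumptions: the helper names (`leave_one_out`, `add_one_in`, `stability`), the toy profile, and the toy significance rule are all hypothetical, and the significance test is passed in as a callable because the summary does not specify it.

```python
import random

def leave_one_out(signature):
    """Yield every signature obtained by dropping exactly one gene."""
    for gene in signature:
        yield {g: s for g, s in signature.items() if g != gene}

def add_one_in(signature, gene_pool, n_trials, rng):
    """Yield signatures extended by one randomly chosen extra gene and sign."""
    candidates = sorted(set(gene_pool) - set(signature))
    for _ in range(n_trials):
        perturbed = dict(signature)
        perturbed[rng.choice(candidates)] = rng.choice([+1, -1])
        yield perturbed

def stability(perturbed_signatures, is_significant):
    """Proportion of perturbed signatures whose connection stays significant."""
    results = [bool(is_significant(sig)) for sig in perturbed_signatures]
    return sum(results) / len(results) if results else 0.0


# Toy data (assumed): signed ranks for one compound and a 3-gene signature.
profile = {"G1": 6, "G2": 5, "G3": 4, "G4": -3, "G5": -2, "G6": 1}
signature = {"G1": +1, "G2": +1, "G4": -1}

def toy_is_significant(sig):
    # Stand-in for a real statistical test: mean signed rank above a cutoff.
    return sum(profile[g] * s for g, s in sig.items()) / len(sig) >= 4.2

rng = random.Random(0)
loo = stability(leave_one_out(signature), toy_is_significant)
aoi = stability(add_one_in(signature, profile, 20, rng), toy_is_significant)
print(f"leave-one-out stability: {loo:.2f}, add-one-in stability: {aoi:.2f}")
```

A connection whose stability stays close to 1 under both perturbations would be a stronger candidate for follow-up than one that depends on a single gene.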

Planned Impact

In addition to academic beneficiaries, the biotech and pharmaceutical industries would be interested in the research, as connectivity mapping is well suited to suggesting new uses for existing drugs. Regulatory agencies may also be interested in the development of connectivity mapping technology for its potential application in predictive toxicology and the consequent reduction in animal testing. Identifying new uses for existing drugs is an approach of growing popularity with drug developers because of the high cost and long development cycle associated with new drug discovery. However, most old drugs remarketed for new uses so far have been the result of chance observation; the connectivity mapping technique can offer a more targeted screening approach to finding new uses. The longer-term economic implication of the development and successful application of connectivity mapping is time and cost savings for drug developers and for the wider economy. The improved methodology and the statistical rigour we introduced into the connectivity mapping exercise have led to increased specificity and sensitivity in identifying meaningful biological connections. Our work has generated a great deal of interest from both academia and industry. To further increase impact, we plan to explore the possibility of commercially exploiting our research results. With professional support from the Knowledge Exploitation Unit (KEU) at QUB, the applicant plans to undertake consultancy activities in the area of gene-expression connectivity mapping, particularly in the identification of new uses for existing drugs, which is closely related to the proposed research and directly benefits from the findings of this project. These activities will provide an important platform to transfer our knowledge and expertise to industry, to the public and to the wider community.

Publications

Malcomson B (2016) Connectivity mapping (ssCMap) to predict A20-inducing drugs and their antiinflammatory action in cystic fibrosis. in Proceedings of the National Academy of Sciences of the United States of America

 
Description With regard to methodology and algorithmic development, we have established a well-tested protocol and workflow for constructing gene expression signatures. We started from differential gene expression analysis, typically comparing a disease state with the corresponding normal condition. Genes were first filtered by their statistical significance using stringent criteria. Significantly differentially expressed genes were then ranked in descending order of importance, combining their statistical significance level and the magnitude of differential expression. We established the protocol for the gene signature progression procedure and this was employed in our application to the inhibition of the AML phenotype.
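The filter, rank and progression steps described above can be sketched as below. The combined importance score used here, -log10(p) multiplied by |log2 fold change|, is an assumption chosen for illustration; the record does not state how significance and magnitude were combined. The function names, cutoff and toy data are likewise hypothetical.

```python
import math

def build_ranked_signature(de_results, p_cutoff=0.01):
    """Filter genes by significance, then rank by a combined importance score.

    de_results: list of (gene, log2_fold_change, p_value) tuples.
    Returns a list of (gene, sign) with sign +1 for up- and -1 for
    down-regulated genes, most important gene first.
    """
    significant = [r for r in de_results if r[2] < p_cutoff]

    def importance(record):
        _, lfc, p = record
        # Assumed combination rule; guard against p == 0.
        return -math.log10(max(p, 1e-300)) * abs(lfc)

    ranked = sorted(significant, key=importance, reverse=True)
    return [(gene, 1 if lfc > 0 else -1) for gene, lfc, _ in ranked]

def signature_progression(ranked_signature):
    """Yield progressively longer signatures: top 1 gene, top 2 genes, ..."""
    for length in range(1, len(ranked_signature) + 1):
        yield dict(ranked_signature[:length])


# Toy differential-expression results (assumed values).
toy_results = [
    ("A", 2.0, 1e-6),
    ("B", -3.0, 1e-3),
    ("C", 0.5, 1e-8),
    ("D", 4.0, 0.2),   # large change but not significant, so filtered out
]
ranked = build_ranked_signature(toy_results)
print(ranked)  # [('A', 1), ('B', -1), ('C', 1)]
for sig in signature_progression(ranked):
    print(len(sig), sig)
```

Each signature produced by the progression would then be submitted to connectivity mapping, and the length giving the desired statistical stringency retained.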

As an application of connectivity mapping to phenotypic targeting, we addressed the inhibition of the AML disease phenotype in two independent studies, one using a mouse model of AML and one using human expression datasets. Following the gene signature progression process, we constructed optimal gene signatures representing AML disease states. Connectivity mapping, combined with our gene signature perturbation method, returned a number of candidate compounds for each study. By overlapping the results from both studies, we identified Entinostat as an effective inhibitor of the AML disease state. This was subsequently validated in the lab in cell lines and a mouse model (Ramsey et al 2013). In this case the prediction from gene expression connectivity mapping formed the basis of a fresh hypothesis which was then validated, leading to new mechanistic insights in the process.

Embracing the fast-moving Next Generation Sequencing technology, we investigated the feasibility of incorporating RNA-Seq data into the connectivity mapping framework. When we compared an RNA-Seq dataset with a microarray dataset, the overlap between the two technologies was highly significant. By successfully integrating microarray and RNA-Seq datasets with chemical-induced expression profiles, we identified the nicotine derivative cotinine as being able to suppress the proliferative phenotype of prostate cancer cells (McArt et al 2013).

With regard to software development, we have implemented a high performance computing model of connectivity mapping. A GPU-based software tool, cudaMap, was developed and tested. The current version of cudaMap implements the core functionalities of sscMap with GPU-enabled computing capacity. We were able to demonstrate dramatic speed differences between the high performance cudaMap and the original sscMap as the computational load increases for high-accuracy evaluation of p-values (McArt et al 2013). Results from the analysis of multiple gene signatures, which would previously have taken several days, can now be obtained in as little as 10 minutes, greatly facilitating high-throughput discovery of candidate compounds.
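To show why high-accuracy p-value evaluation is computationally heavy, the following is a minimal sketch of an empirical p-value estimated from random signatures. It is an assumption-laden illustration, not the sscMap or cudaMap algorithm: the function names, the toy profile and the add-one estimator are hypothetical choices. The smallest p-value that can be resolved is about 1/(n_random + 1), so each extra decimal place of accuracy multiplies the number of random signatures to score by ten, and each of those scores is independent of the others, which is what makes the task suitable for parallel hardware.

```python
import random

def raw_score(profile, signature):
    """Sum of signed ranks of the signature genes, weighted by signature sign."""
    return sum(profile.get(g, 0) * s for g, s in signature.items())

def empirical_p_value(profile, signature, n_random, rng):
    """Estimate a two-sided p-value from random signatures of the same length."""
    observed = abs(raw_score(profile, signature))
    genes = sorted(profile)
    m = len(signature)
    hits = 0
    for _ in range(n_random):
        # Random signature: m distinct genes, each with a random direction.
        random_sig = {g: rng.choice([+1, -1]) for g in rng.sample(genes, m)}
        if abs(raw_score(profile, random_sig)) >= observed:
            hits += 1
    # Add-one estimator so the estimate is never exactly zero.
    return (hits + 1) / (n_random + 1)


# Toy profile (assumed): 20 genes with signed ranks, alternating direction.
toy_profile = {f"G{i}": (i if i % 2 == 0 else -i) for i in range(1, 21)}
top_signature = {"G20": +1, "G19": -1, "G18": +1}   # matches the three top ranks

p = empirical_p_value(toy_profile, top_signature, 2000, random.Random(0))
print(f"estimated p-value: {p:.4f}")
```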
Exploitation Route Our research in this project has generated a lot of interest from colleagues here at QUB and from international collaborators. We have received a number of collaboration requests to apply our connectivity mapping methods to different systems and different research topics. The preliminary results from these collaborations formed the basis of several grant proposals that we developed with colleagues. So far we have secured two major grants that use our connectivity mapping approach as an essential component. Colleagues and collaborators are therefore already benefiting from the research output of the project, demonstrating the impact of our research on academic beneficiaries.
Sectors Healthcare,Pharmaceuticals and Medical Biotechnology,Other

 
Description The findings of our research formed the background expertise that is being utilized in an international consortium on category approaches and read-across in regulatory programmes.
First Year Of Impact 2015
Sector Chemicals,Environment
Impact Types Policy & public services

 
Description A pilot study to holistically target dysfunctional pathways in cystinosis: Drug repurposing with gene expression connectivity mapping
Amount € 10,000 (EUR)
Organisation Cystinosis Ireland 
Sector Charity/Non Profit
Country Ireland
Start 07/2019 
End 03/2020
 
Description Golden Anniversary Research Programme project grant
Amount £104,000 (GBP)
Organisation Leukaemia & Lymphoma NI 
Sector Charity/Non Profit
Country United Kingdom
Start 12/2014 
End 11/2016
 
Description New Technologies for Category Approaches and Read-across
Amount € 2,250,000 (EUR)
Organisation European Oil Company Organisation for Environment, Health and Safety 
Sector Charity/Non Profit
Country Belgium
Start 07/2016 
End 06/2018
 
Description Population Research Committee (PRC) Project
Amount £194,473 (GBP)
Funding ID C37316/A18225 
Organisation Cancer Research UK 
Sector Charity/Non Profit
Country United Kingdom
Start 12/2014 
End 11/2017
 
Description Scientific Research Grants
Amount £68,500 (GBP)
Organisation Northern Ireland Chest Heart and Stroke Association (NICHS) 
Sector Charity/Non Profit
Country United Kingdom
Start 12/2015 
End 11/2017
 
Description An EU-UK-USA consortium on category approach and Read-across in Regulatory Programmes 
Organisation European Oil Company Organisation for Environment, Health and Safety
Country Belgium 
Sector Charity/Non Profit 
PI Contribution Our research team contributes to the connectivity mapping analyses of gene expression profiles generated by partners of this consortium.
Collaborator Contribution Our partners in this consortium generate cell-based gene expression data in their labs for subsequent analyses, including connectivity mapping, which is an area of expertise of my own team.
Impact Active collaboration: data are being generated in the labs.
Start Year 2015
 
Description An EU-UK-USA consortium on category approach and Read-across in Regulatory Programmes 
Organisation North Carolina State University
Country United States 
Sector Academic/University 
PI Contribution Our research team contributes to the connectivity mapping analyses of gene expression profiles generated by partners of this consortium.
Collaborator Contribution Our partners in this consortium generate cell-based gene expression data in their labs for subsequent analyses, including connectivity mapping, which is an area of expertise of my own team.
Impact Active collaboration: data are being generated in the labs.
Start Year 2015
 
Description An EU-UK-USA consortium on category approach and Read-across in Regulatory Programmes 
Organisation Public Health England
Country United Kingdom 
Sector Public 
PI Contribution Our research team contributes to the connectivity mapping analyses of gene expression profiles generated by partners of this consortium.
Collaborator Contribution Our partners in this consortium generate cell-based gene expression data in their labs for subsequent analyses, including connectivity mapping, which is an area of expertise of my own team.
Impact Active collaboration: data are being generated in the labs.
Start Year 2015
 
Description An EU-UK-USA consortium on category approach and Read-across in Regulatory Programmes 
Organisation Texas A&M University
Country United States 
Sector Academic/University 
PI Contribution Our research team contributes to the connectivity mapping analyses of gene expression profiles generated by partners of this consortium.
Collaborator Contribution Our partners in this consortium generate cell-based gene expression data in their labs for subsequent analyses, including connectivity mapping, which is an area of expertise of my own team.
Impact Active collaboration: data are being generated in the labs.
Start Year 2015
 
Title GPU based high performance computing solution for gene expression connectivity mapping 
Description Our new software, cudaMap, implemented using CUDA C/C++, harnesses the computational power of NVIDIA GPUs (Graphics Processing Units) to greatly reduce processing times for gene expression connectivity mapping. cudaMap makes a strong contribution in the discovery of candidate therapeutics by enabling speedy execution of heavy duty connectivity mapping tasks, which are increasingly required in modern cancer research. 
Type Of Technology Software 
Year Produced 2013 
Open Source License? Yes  
Impact Highly accessed software on the BMC website. Top 10 accessed software paper in the month following publication.
URL http://purl.oclc.org/NET/cudaMap
 
Title QUADrATiC: scalable gene expression connectivity mapping for repurposing FDA-approved therapeutics 
Description QUADrATiC is a user-friendly tool for the exploration of gene expression connectivity on the subset of the LINCS data set corresponding to FDA-approved small molecule compounds. It enables the identification of compounds with therapeutic repurposing potential. The software is designed to cope with the increased volume of data over existing tools by taking advantage of multicore computing architectures to provide a scalable solution, which may be installed and operated on a range of computers, from laptops to servers. This scalability is provided by the concurrent programming model of the Akka framework. The QUADrATiC Graphical User Interface (GUI) has been developed using advanced JavaScript frameworks, providing novel visualization capabilities for further analysis of connections. There is also a web services interface, allowing integration with other programs or scripts.
Type Of Technology Software 
Year Produced 2016 
Impact Since its publication, the software has attracted a lot of attention from the biomedical research and software development communities. The software has now been listed on the OMICtools website at the following URL. https://omictools.com/qub-accelerated-drug-and-transcriptomic-connectivity-tool
URL http://go.qub.ac.uk/QUADrATiC
 
Description COST-Action Training School 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? Yes
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact The presentation generated great interest from the audience, followed by positive feedback.

The talk was invited twice following the very positive feedback from the audience, most of whom were postgraduate students and some of whom were undergraduates.
Year(s) Of Engagement Activity 2011,2012
 
Description Cancer Research Centre Open Day "Ask me about my research" Activities 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Over 350 people attended the Cancer Research Centre's Open Day on Saturday 9 May 2015. It was a hugely successful day, with 100% of those responding saying the event was outstanding/very good. As a PI in the Cancer Research Centre, Dr Zhang participated in the "Ask me about my research" activities, talking to visitors about the latest research being carried out in our group and its potential benefits to healthcare.
Year(s) Of Engagement Activity 2015
 
Description Summer School in Computational Biology 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact The presentation generated great interest from the audience and prompted follow-up queries afterwards.

Promoted understanding of the research topic. Further requests were received to host school students on placement to gain experience in our lab.
Year(s) Of Engagement Activity 2012,2013