Data mining epidemiological relationships: integration of causal analysis with published evidence
Lead Research Organisation:
University of Bristol
Department Name: UNLISTED
Abstract
Causal inference in epidemiology focuses on identifying the risk factors that cause disease. Established approaches focus on specific risk factors that may impact on specific diseases. However, the wealth of biomedical data that now exist enable us to assess the causal relationships between a broad network of risk factors and diseases. By considering a much wider network of such relationships we will establish the relative importance of different risk factors and the potential side-effects of interventions that target those risk factors. We will also integrate biological data (eg molecular pathways, drug targets) with causal relationships to enable us to understand the molecular mechanisms that lead to disease, and identify potential pharmaceutical and public health interventions. These data and relationships will be combined in a purpose-built “graph” database and methods will be developed to mine for novel causal risk factors and potential interventions.
The data that we collate for our research within this programme will have wide-reaching value to the research community. We will provide an open and accessible software platform for other researchers to search and use the various datasets we have integrated for their own research.
Technical Summary
Background: The increasing availability of complex, high-dimensional epidemiological data necessitates innovative and scalable approaches to harness this power to address research questions of biomedical importance.
Aims: Motivated by the widespread adoption of Mendelian randomization and the opportunities to integrate multiple data sources for the triangulation of evidence in epidemiological research, this programme will develop and apply novel data mining approaches in integrative epidemiology. We will also develop and implement a software platform to enable research questions of major epidemiological importance to be addressed rapidly and at scale.
The programme will focus on (a) integration of cutting edge statistical methods under development in the MRC Integrative Epidemiology Unit (MRC-IEU) with extensive data in a graph database; (b) development of subgraph searching algorithms; and (c) identification of causal mechanistic pathways to disease. EpiGraphDB will be a resource of extensive value to the programme, the MRC-IEU and the wider research community.
Research plans: The programme will implement a data mining approach by developing a new graph database (EpiGraphDB) that will integrate cutting edge causal analysis evidence with comprehensive data on relationships between traits, risk factors, biomarkers, intervention targets and diseases. These data will originate from Mendelian randomization, genetic and observational correlation from epidemiological studies, relationships mined from the literature, and a wide array of bioinformatics sources describing molecular relationships. EpiGraphDB will enable aetiological hypotheses to be generated and explored.
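As an illustration of the kind of query such a graph supports, the following minimal sketch stores typed relationships in memory and enumerates short directed paths from a risk factor to a disease. The node names, relationship labels and the `find_paths` helper are hypothetical and do not reflect EpiGraphDB's actual schema or search algorithms.

```python
from collections import defaultdict, deque

# Hypothetical typed edges: (source, relationship, target).
# Names and labels are illustrative only, not EpiGraphDB's schema.
EDGES = [
    ("body mass index", "MR", "C-reactive protein"),
    ("C-reactive protein", "LITERATURE", "coronary heart disease"),
    ("body mass index", "MR", "coronary heart disease"),
    ("IL6R", "PROTEIN_TO_TRAIT", "C-reactive protein"),
]


def build_graph(edges):
    """Index edges by source node for fast neighbour lookup."""
    graph = defaultdict(list)
    for source, relationship, target in edges:
        graph[source].append((relationship, target))
    return graph


def find_paths(graph, start, goal, max_hops=3):
    """Breadth-first enumeration of simple directed paths of at most max_hops edges.

    Each path is a list alternating node, relationship, node, ...
    """
    paths = []
    queue = deque([[start]])
    while queue:
        path = queue.popleft()
        node = path[-1]
        if node == goal and len(path) > 1:
            paths.append(path)
            continue
        if (len(path) - 1) // 2 >= max_hops:
            continue  # hop limit reached
        visited = set(path[::2])  # nodes sit at even positions
        for relationship, target in graph.get(node, []):
            if target not in visited:
                queue.append(path + [relationship, target])
    return paths


graph = build_graph(EDGES)
paths = find_paths(graph, "body mass index", "coronary heart disease")
for path in paths:
    print(" -> ".join(path))
```

Keeping the relationship type on every edge lets a single search mix evidence sources (for example a Mendelian randomization edge followed by a literature-derived edge), which is the sense in which different evidence types can be triangulated along a path.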
Data sharing and health applications: The database, software and results generated by this programme will be made openly available to the wider scientific community for application to a range of potential health questions (eg identifying causal risk factors for disease, identifying side-effects of interventions, etc).
Organisations
- University of Bristol (Lead Research Organisation)
- Biogen (Collaboration)
- Newcastle University (Collaboration)
- CeMM Research Center for Molecular Medicine (Collaboration)
- Biogen Idec (Collaboration)
- Norwegian University of Science and Technology (NTNU) (Collaboration)
- Imperial College London (Collaboration)
- University of Exeter (Collaboration)
- Leiden University Medical Center (Collaboration)
- University of Pennsylvania (Collaboration)
- Oracle Corporation (Collaboration)
- Health Data Research UK (Collaboration)
- GlaxoSmithKline (GSK) (Collaboration)
- University of Washington (Collaboration)
- King's College London (Collaboration)
- University of Bristol (Collaboration)
Publications
Ahluwalia TS
(2021)
Genome-wide association study of circulating interleukin 6 levels identifies novel loci.
in Human molecular genetics
Alcala K
(2023)
Kidney Function and Risk of Renal Cell Carcinoma.
in Cancer epidemiology, biomarkers & prevention : a publication of the American Association for Cancer Research, cosponsored by the American Society of Preventive Oncology
Ambatipudi S
(2018)
DNA methylation derived systemic inflammation indices are associated with head and neck cancer development and survival.
in Oral oncology
Anderson E
(2022)
Little genomic support for Cyclophilin A-matrix metalloproteinase-9 pathway as a therapeutic target for cognitive impairment in APOE4 carriers
in Scientific Reports
Armitage JM
(2021)
Peer victimisation during adolescence and its impact on wellbeing in adulthood: a prospective cohort study.
in BMC public health
Baird DA
(2021)
Identifying drug targets for neurological and psychiatric disease via genetics and the brain transcriptome.
in PLoS genetics
Barker ED
(2018)
Inflammation-related epigenetic risk and child and adolescent mental health: A prospective study from pregnancy to middle adolescence.
in Development and psychopathology
Barker R
(2023)
Associations of CTCF and FOXA1 with androgen and IGF pathways in men with localized prostate cancer.
in Growth hormone & IGF research : official journal of the Growth Hormone Research Society and the International IGF Research Society
Related Projects
Project Reference | Relationship | Related To | Start | End | Award Value |
---|---|---|---|---|---|
MC_UU_00011/1 | | | 31/03/2018 | 30/03/2023 | £2,864,000 |
MC_UU_00011/2 | Transfer | MC_UU_00011/1 | 31/03/2018 | 30/03/2023 | £965,000 |
MC_UU_00011/3 | Transfer | MC_UU_00011/2 | 31/03/2018 | 30/03/2023 | £1,011,000 |
MC_UU_00011/4 | Transfer | MC_UU_00011/3 | 31/03/2018 | 30/03/2023 | £1,329,000 |
MC_UU_00011/5 | Transfer | MC_UU_00011/4 | 31/03/2018 | 30/03/2023 | £1,254,000 |
MC_UU_00011/6 | Transfer | MC_UU_00011/5 | 31/03/2018 | 30/03/2023 | £1,640,000 |
MC_UU_00011/7 | Transfer | MC_UU_00011/6 | 31/03/2018 | 30/03/2023 | £1,083,000 |
Title | Reducing drug development costs (animation) |
Description | This short animation explains how we use Mendelian randomization and colocalization to help prioritise drug targets. |
Type Of Art | Film/Video/Animation |
Year Produced | 2020 |
Impact | N/A |
URL | https://youtu.be/t77LZZlF4iw |
Description | Academy of Medical Sciences Springboard Award |
Amount | £99,997 (GBP) |
Funding ID | SBF006\1117 |
Organisation | Academy of Medical Sciences (AMS) |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 07/2021 |
End | 07/2023 |
Description | BHF 4-year PhD programme in Integrated Cardiovascular Science |
Amount | £1,439,856 (GBP) |
Organisation | British Heart Foundation (BHF) |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 09/2021 |
End | 09/2028 |
Description | Biogen collaboration on MR-Base |
Amount | £263,667 (GBP) |
Organisation | Biogen Idec |
Sector | Private |
Country | United States |
Start | 03/2021 |
End | 03/2023 |
Description | CRUK Integrative Cancer Epidemiology Programme |
Amount | £7,715,113 (GBP) |
Funding ID | C18281/A29019 |
Organisation | Cancer Research UK |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 09/2020 |
End | 09/2025 |
Description | CSC - Bristol PhD studentship |
Amount | £150,400 (GBP) |
Funding ID | 202008320304 |
Organisation | Chinese Scholarship Council |
Sector | Charity/Non Profit |
Country | China |
Start | 09/2020 |
End | 09/2024 |
Description | Developing cross-population Mendelian randomization for generalizing evidence on drug targets: the MRC Cross-Population Mendelian Randomization Network |
Amount | £118,937 (GBP) |
Funding ID | MC_PC_21018 |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2022 |
End | 12/2023 |
Description | Joint MRC units workshop integrating genetics with g target prioritisation |
Amount | £68,000 (GBP) |
Funding ID | MC_PC_20042 |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2021 |
End | 03/2021 |
Description | MICA: NURTuRE - changing the landscape of renal medicine to foster a unified approach to stratified medicine |
Amount | £2,589,391 (GBP) |
Funding ID | MR/R013942/1 |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 06/2018 |
End | 07/2024 |
Description | Molecular Genetic and Lifecourse Epidemiology |
Amount | £5,153,712 (GBP) |
Funding ID | 218495/Z/19/Z |
Organisation | Wellcome Trust |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 08/2020 |
End | 09/2028 |
Description | NIHR Bristol Biomedical Research Centre |
Amount | £11,694,330 (GBP) |
Organisation | National Institute for Health Research |
Sector | Public |
Country | United Kingdom |
Start | 12/2022 |
End | 11/2027 |
Description | Turing Fellowship |
Amount | £9,990 (GBP) |
Organisation | Alan Turing Institute |
Sector | Academic/University |
Country | United Kingdom |
Start | 09/2018 |
End | 09/2020 |
Title | DrivR-Base |
Description | DrivR-Base is a pipeline for extracting feature information from different databases for single nucleotide variants (SNVs). These features are designed to be inputs for machine learning models, aiding in the prediction of functional impacts of genetic variants in human genome sequencing. |
Type Of Material | Computer model/algorithm |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | This is forming the basis of ongoing work for variant effect prediction (in preparation for publication) |
URL | https://github.com/amyfrancis97/DrivR-Base |
Title | EpiGraphDB |
Description | EpiGraphDB is a database of epidemiological relationships, including causal estimates from Mendelian randomization, genetic correlations, literature-derived relationships, and links to biological pathway data, drug targets and others. |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | No |
Impact | This is due for open release in Q2 2019. The database includes pre-computed causal estimates for a wide range of risk factors on many disease phenotypes and outcomes. The risk factors include potential drug targets, and the platform is currently being used by our collaborators from the pharmaceutical industry to evaluate potential drug targets. |
URL | http://www.epigraphdb.org/ |
Title | GoDMC mQTL database |
Description | Database of methylation quantitative trait loci (mQTL) due to be openly released on publication of the GoDMC consortium paper. |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | No |
Impact | This is the largest mQTL analysis to date, providing genetic instruments for use in Mendelian randomization analyses of DNA methylation. |
URL | http://mqtldb.godmc.org.uk/ |
Title | IEU OpenGWAS database |
Description | This is a database of genome-wide association study data summary statistics implemented using ElasticSearch in Oracle Cloud. It was built using data originally collected and curated for the MR-Base web application (http://www.mrbase.org) |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | The new architecture of this database makes it significantly faster, supporting a much wider range and larger scale of analyses. |
URL | https://gwas.mrcieu.ac.uk |
Description | Biogen collaboration |
Organisation | Biogen Idec |
Country | United States |
Sector | Private |
PI Contribution | We are continuing a previous collaboration with Biogen focused on drug target prioritization using Mendelian randomization and genetic colocalization with molecular QTL datasets. |
Collaborator Contribution | Biogen are providing funding, pharmaceutical expertise and datasets relevant to their target areas. |
Impact | N/A |
Start Year | 2021 |
Description | CHS proteome MR working group |
Organisation | University of Washington |
Country | United States |
Sector | Academic/University |
PI Contribution | Researchers in my team are contributing expertise in proteome Mendelian randomization and genetic colocalization |
Collaborator Contribution | Our partners are contributing expertise in proteomics and cardiovascular disease, along with relevant datasets |
Impact | N/A |
Start Year | 2020 |
Description | CUP-Global |
Organisation | Imperial College London |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | We are collaborating with the Global Cancer Update Programme (CUP-Global) team on methods to automate the systematic review processes used in the CUP project. |
Collaborator Contribution | The CUP-Global team are providing information on the challenges of information extraction from the literature, and human-curated training datasets. |
Impact | None yet |
Start Year | 2023 |
Description | CVD-COVID-UK |
Organisation | Health Data Research UK |
Country | United Kingdom |
Sector | Private |
PI Contribution | Analyses on the potential role of drug targets in COVID-19 |
Collaborator Contribution | This is a HDR-UK consortium with wide contributions from partners in terms of data, expertise, analyses and technologies. |
Impact | N/A |
Start Year | 2020 |
Description | Genetics of DNA Methylation Consortium |
Organisation | CeMM Research Center for Molecular Medicine |
Country | Austria |
Sector | Academic/University |
PI Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from the Accessible Resource for Integrated Epigenomics Studies (ARIES) |
Collaborator Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from other studies. |
Impact | Multi-disciplinary collaboration involving molecular epidemiology, statistics and bioinformatics. Outputs: Database of methylation QTL: http://mqtldb.godmc.org.uk/ Publication pending |
Start Year | 2013 |
Description | Genetics of DNA Methylation Consortium |
Organisation | King's College London |
Department | Brain Bank |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from the Accessible Resource for Integrated Epigenomics Studies (ARIES) |
Collaborator Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from other studies. |
Impact | Multi-disciplinary collaboration involving molecular epidemiology, statistics and bioinformatics. Outputs: Database of methylation QTL: http://mqtldb.godmc.org.uk/ Publication pending |
Start Year | 2013 |
Description | Genetics of DNA Methylation Consortium |
Organisation | Leiden University Medical Center |
Country | Netherlands |
Sector | Academic/University |
PI Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from the Accessible Resource for Integrated Epigenomics Studies (ARIES) |
Collaborator Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from other studies. |
Impact | Multi-disciplinary collaboration involving molecular epidemiology, statistics and bioinformatics. Outputs: Database of methylation QTL: http://mqtldb.godmc.org.uk/ Publication pending |
Start Year | 2013 |
Description | Genetics of DNA Methylation Consortium |
Organisation | Newcastle University |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from the Accessible Resource for Integrated Epigenomics Studies (ARIES) |
Collaborator Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from other studies. |
Impact | Multi-disciplinary collaboration involving molecular epidemiology, statistics and bioinformatics. Outputs: Database of methylation QTL: http://mqtldb.godmc.org.uk/ Publication pending |
Start Year | 2013 |
Description | Genetics of DNA Methylation Consortium |
Organisation | University of Bristol |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from the Accessible Resource for Integrated Epigenomics Studies (ARIES) |
Collaborator Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from other studies. |
Impact | Multi-disciplinary collaboration involving molecular epidemiology, statistics and bioinformatics. Outputs: Database of methylation QTL: http://mqtldb.godmc.org.uk/ Publication pending |
Start Year | 2013 |
Description | Genetics of DNA Methylation Consortium |
Organisation | University of Exeter |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from the Accessible Resource for Integrated Epigenomics Studies (ARIES) |
Collaborator Contribution | Contributing to a consortial analysis of methylation quantitative trait loci using data from other studies. |
Impact | Multi-disciplinary collaboration involving molecular epidemiology, statistics and bioinformatics. Outputs: Database of methylation QTL: http://mqtldb.godmc.org.uk/ Publication pending |
Start Year | 2013 |
Description | IEU/HUNT collaboration |
Organisation | Norwegian University of Science and Technology (NTNU) |
Country | Norway |
Sector | Academic/University |
PI Contribution | Mendelian randomization, genetic and molecular epidemiology applied to UK Biobank |
Collaborator Contribution | Mendelian randomization, genetic and molecular epidemiology applied to the HUNT study |
Impact | N/A |
Start Year | 2019 |
Description | IEU/UPenn collaboration |
Organisation | University of Pennsylvania |
Country | United States |
Sector | Academic/University |
PI Contribution | Mendelian randomization projects: conception, design, analysis and interpretation |
Collaborator Contribution | Mendelian randomization projects: conception, design, data and compute resources and interpretation |
Impact | Multi-disciplinary, integrating clinical, epidemiological and informatics expertise. Outputs: doi: 10.1007/s00125-022-05653-1 |
Start Year | 2019 |
Description | MR-Base collaboration |
Organisation | Biogen |
Country | United Kingdom |
Sector | Private |
PI Contribution | We are collaborating with GlaxoSmithKline and Biogen on the further development and enhancement of the MR-Base platform, with a particular focus on the evaluation of potential drug targets. |
Collaborator Contribution | The industry partners are providing scientific input on the project and advising on how to maximise the translational value of the MR-Base platform. |
Impact | Outputs/outcomes: * expansion of the database underlying MR-Base. Papers: * Baird DA, Liu JZ, Zheng J, Sieberts SK, Perumal T, Elsworth B, Richardson TG... AMP-AD eQTL working group. (2021). Identifying drug targets for neurological and psychiatric disease via genetics and the brain transcriptome. PLoS genetics, 17 (1), pp. e1009224 * Zheng J, Haberland V, Baird D, Walker V, Haycock PC, Hurle MR, Gutteridge A... Gaunt TR. (2020). Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nature genetics, 52 (10), pp. 1122-1131 |
Start Year | 2017 |
Description | MR-Base collaboration |
Organisation | GlaxoSmithKline (GSK) |
Country | Global |
Sector | Private |
PI Contribution | We are collaborating with GlaxoSmithKline and Biogen on the further development and enhancement of the MR-Base platform, with a particular focus on the evaluation of potential drug targets. |
Collaborator Contribution | The industry partners are providing scientific input on the project and advising on how to maximise the translational value of the MR-Base platform. |
Impact | Outputs/outcomes: * expansion of the database underlying MR-Base. Papers: * Baird DA, Liu JZ, Zheng J, Sieberts SK, Perumal T, Elsworth B, Richardson TG... AMP-AD eQTL working group. (2021). Identifying drug targets for neurological and psychiatric disease via genetics and the brain transcriptome. PLoS genetics, 17 (1), pp. e1009224 * Zheng J, Haberland V, Baird D, Walker V, Haycock PC, Hurle MR, Gutteridge A... Gaunt TR. (2020). Phenome-wide Mendelian randomization mapping the influence of the plasma proteome on complex diseases. Nature genetics, 52 (10), pp. 1122-1131 |
Start Year | 2017 |
Description | Oracle MR-Base collaboration |
Organisation | Oracle Corporation |
Department | Oracle Corporation UK Ltd |
Country | United Kingdom |
Sector | Private |
PI Contribution | We implemented an ElasticSearch database in Oracle Cloud using credits provided by Oracle. We then transferred data from the IEU GWAS database into this system and connected it to the IEU OpenGWAS database (https://gwas.mrcieu.ac.uk) for use by the wider research community. |
Collaborator Contribution | Oracle provided free credits and support with configuration and optimisation of a virtual cluster to support our ElasticSearch database. |
Impact | IEU GWAS database: https://gwas.mrcieu.ac.uk |
Start Year | 2018 |
Title | ASQ |
Description | The EpiGraphDB-ASQ (ASQ; /ɑːsk/, i.e. "ask") interface is a natural language interface for querying the integrated epidemiological evidence in the EpiGraphDB data and ecosystem. A query starts either from a short paragraph of text, from which ASQ derives and extracts claim triples, or from claim triples supplied directly by the user. ASQ then retrieves data from EpiGraphDB, both biomedical entities and evidence from various sources, to facilitate the triangulation of evidence regarding a specific claim. |
Type Of Technology | Webtool/Application |
Year Produced | 2022 |
Open Source License? | Yes |
Impact | Publication pre-printed and in submission |
URL | https://asq.epigraphdb.org/ |
Title | EpiGraphDB |
Description | EpiGraphDB is an analytical platform and database to support data mining in epidemiology. The platform incorporates a graph of causal estimates generated by systematically applying Mendelian randomization to a wide array of phenotypes, and augments this with a wealth of additional data from other bioinformatic sources. EpiGraphDB aims to support appropriate application and interpretation of causal inference in systematic automated analyses of many phenotypes. There is also an epigraphdb R package that provides easy access to EpiGraphDB services. Here "epigraphdb" refers to the R package, whereas "EpiGraphDB" refers to the overall platform. |
Type Of Technology | Webtool/Application |
Year Produced | 2020 |
Open Source License? | Yes |
Impact | The database includes data from our systematic proteome-wide analysis of potential drug targets (published in Nature Genetics, 2020), which has been widely accessed by researchers from around the world. |
URL | https://epigraphdb.org/ |
Title | MELODI Presto |
Description | The field of literature based discovery is growing in step with the volume of literature being produced. From modern natural language processing algorithms to high quality entity tagging, the methods and their impact are developing rapidly. One annotation object that arises from these approaches, the subject-predicate-object triple, is proving to be very useful in representing knowledge. We have implemented efficient search methods and an application programming interface (API), to create fast and convenient functions to utilize triples extracted from the biomedical literature by SemMedDB. By refining these data we have identified a set of triples that focus on the mechanistic aspects of the literature, and provide simple methods to explore both enriched triples from single queries, and overlapping triples across two query lists. |
Type Of Technology | Webtool/Application |
Year Produced | 2020 |
Open Source License? | Yes |
Impact | N/A |
URL | https://melodi-presto.mrcieu.ac.uk/ |
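The idea of overlapping triples across two query lists can be sketched as follows. This is a minimal illustration under assumed data, not the MELODI Presto API: the triples are invented rather than taken from SemMedDB, and `overlapping_triples` is a hypothetical helper that pairs a triple from the first list with a triple from the second whenever the object of the former is the subject of the latter.

```python
# Hypothetical subject-predicate-object triples; not real SemMedDB output.
TRIPLES_A = [
    ("sleep duration", "AFFECTS", "leptin"),
    ("sleep duration", "ASSOCIATED_WITH", "cortisol"),
]
TRIPLES_B = [
    ("leptin", "STIMULATES", "appetite"),
    ("insulin", "AFFECTS", "appetite"),
]


def overlapping_triples(triples_a, triples_b):
    """Pair triples where the object of a triple in A is the subject of a triple in B."""
    by_subject = {}
    for triple in triples_b:
        by_subject.setdefault(triple[0], []).append(triple)
    return [
        (a, b)
        for a in triples_a
        for b in by_subject.get(a[2], [])
    ]


overlaps = overlapping_triples(TRIPLES_A, TRIPLES_B)
for a, b in overlaps:
    print(f"{a[0]} -[{a[1]}]-> {a[2]} -[{b[1]}]-> {b[2]}")
```

Each returned pair suggests an intermediate term linking the two queries, which is one simple way that literature-derived triples can hint at a mechanism.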
Title | MR-Base |
Description | MR-Base is a web application and R package that provides a range of methods for two-sample Mendelian randomization and is designed to be used with the IEU GWAS database. |
Type Of Technology | Webtool/Application |
Year Produced | 2016 |
Impact | MR-Base is widely used by researchers to perform two-sample MR. |
URL | http://www.mrbase.org/ |
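For readers unfamiliar with two-sample Mendelian randomization, one standard estimator is the fixed-effect inverse-variance weighted (IVW) combination of per-variant Wald ratios. The sketch below is a self-contained illustration with made-up summary statistics; it is not the MR-Base implementation, and the function names are hypothetical.

```python
import math


def wald_ratio(beta_exposure, beta_outcome):
    """Per-variant causal estimate: SNP-outcome effect over SNP-exposure effect."""
    return beta_outcome / beta_exposure


def ivw_estimate(beta_exposure, beta_outcome, se_outcome):
    """Fixed-effect IVW estimate and standard error.

    Weights are beta_exposure^2 / se_outcome^2 (first-order approximation
    that ignores uncertainty in the SNP-exposure estimates).
    """
    weights = [bx ** 2 / se ** 2 for bx, se in zip(beta_exposure, se_outcome)]
    ratios = [wald_ratio(bx, by) for bx, by in zip(beta_exposure, beta_outcome)]
    total_weight = sum(weights)
    estimate = sum(w * r for w, r in zip(weights, ratios)) / total_weight
    standard_error = math.sqrt(1.0 / total_weight)
    return estimate, standard_error


# Made-up summary statistics for two genetic variants.
beta_x = [0.1, 0.2]    # SNP-exposure effects
beta_y = [0.05, 0.10]  # SNP-outcome effects
se_y = [0.01, 0.02]    # standard errors of the SNP-outcome effects

estimate, standard_error = ivw_estimate(beta_x, beta_y, se_y)
print(estimate, standard_error)
```

Real analyses also require allele harmonisation between the exposure and outcome datasets and sensitivity analyses for pleiotropy, none of which this sketch attempts.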
Title | MR-Base PheWAS tool |
Description | The MR-Base PheWAS tool allows users to rapidly search the associations of a SNP across all phenotypes represented in the IEU GWAS database (part of the MR-Base platform). |
Type Of Technology | Webtool/Application |
Year Produced | 2018 |
Impact | This is used by researchers as a rapid way of reviewing the associations for a single genetic variant using one of the largest public GWAS databases available. |
URL | http://phewas.mrbase.org/ |
Title | MendelVar |
Description | MendelVar provides a quick overview of possible impact of Mendelian disease-related genes on user's complex phenotype of interest. It returns the details of all known broadly defined Mendelian diseases and their causal genes found in the custom genomic intervals as well as overlapping pathogenic rare mutations responsible for Mendelian disease. Enrichment of Disease Ontology, Human Phenotype Ontology terms among the Mendelian genes gives the researcher an overview of any shared features with their trait of interest, e.g. in terms of anatomy. |
Type Of Technology | Webtool/Application |
Year Produced | 2020 |
Open Source License? | Yes |
Impact | Openly accessible to the research community |
URL | https://mendelvar.mrcieu.ac.uk/ |
Title | Vectology - exploring biomedical variable relationships using sentence embedding and vectors |
Description | Many biomedical data sets contain variables that are identified by simple, and often short, descriptions. Traditionally these would either be manually annotated and/or assigned to ontologies using expert knowledge, facilitating interactions with other data sets and gaining an understanding of where these variables lie in the biomedical knowledge space. With Vectology we utilise sentence embedding methods and convert these variables into vectors, calculated from precomputed models derived from biomedical literature to infer relationships between variables. |
Type Of Technology | Webtool/Application |
Year Produced | 2019 |
Impact | The approach has been utilised in the IEU GWAS database to support identification of related datasets. |
URL | http://vectology.mrcieu.ac.uk/ |
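The underlying comparison step can be illustrated with cosine similarity between vectors. In this minimal sketch a bag-of-words count vector stands in for a real sentence embedding, purely so that the example is self-contained; Vectology itself uses models derived from biomedical literature, and the `embed` function here is a hypothetical substitute.

```python
import math
from collections import Counter


def embed(text):
    """Toy stand-in for a sentence embedding: a bag-of-words count vector."""
    return Counter(text.lower().split())


def cosine_similarity(vec_a, vec_b):
    """Cosine of the angle between two sparse vectors (0.0 if either is empty)."""
    dot = sum(vec_a[term] * vec_b[term] for term in vec_a.keys() & vec_b.keys())
    norm_a = math.sqrt(sum(value * value for value in vec_a.values()))
    norm_b = math.sqrt(sum(value * value for value in vec_b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


similar = cosine_similarity(embed("body mass index"), embed("body fat mass"))
unrelated = cosine_similarity(embed("body mass index"), embed("systolic blood pressure"))
print(similar, unrelated)
```

With genuine sentence embeddings the same similarity function applies unchanged; only `embed` would be replaced by a model that maps each variable description to a dense vector.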
Title | epigraphdb-r: An R package to use EpiGraphDB |
Description | This is an R package designed to access data from EpiGraphDB (using the EpiGraphDB API) to support further analysis. |
Type Of Technology | Software |
Year Produced | 2019 |
Open Source License? | Yes |
Impact | Wider accessibility to EpiGraphDB |
URL | http://www.epigraphdb.org/ |
Title | gwas2vcf |
Description | Tool to map GWAS summary statistics to VCF/BCF with on-the-fly harmonisation to a supplied reference FASTA |
Type Of Technology | Software |
Year Produced | 2020 |
Open Source License? | Yes |
Impact | This has now been adopted by the IEU OpenGWAS project for submission of GWAS summary statistics to the database. |
URL | https://github.com/MRCIEU/gwas2vcf |
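Harmonisation against a reference can be sketched as follows. This is a simplified illustration rather than gwas2vcf's actual logic: it assumes the convention that the non-effect allele should match the reference base (so the effect allele becomes the alternate allele), handles single-base alleles only, and ignores strand flips. The record fields and the `harmonise` function are hypothetical.

```python
def harmonise(record, ref_base):
    """Align a summary-statistics record to the reference base at its position.

    record is a dict with keys: effect_allele, other_allele, beta, eaf.
    Returns a new aligned record, or None if neither allele matches the
    reference (assumed convention: other_allele should equal the reference).
    """
    effect_allele = record["effect_allele"]
    other_allele = record["other_allele"]
    if other_allele == ref_base:
        return dict(record)  # already aligned
    if effect_allele == ref_base:
        # Swap the alleles, so the effect direction and frequency flip too.
        flipped = dict(record)
        flipped["effect_allele"] = other_allele
        flipped["other_allele"] = effect_allele
        flipped["beta"] = -record["beta"]
        flipped["eaf"] = 1.0 - record["eaf"]
        return flipped
    return None  # alleles inconsistent with the reference; drop the record


record = {"effect_allele": "A", "other_allele": "G", "beta": 0.2, "eaf": 0.25}
aligned = harmonise(record, ref_base="A")
print(aligned)
```

In practice the reference base would be read from the supplied FASTA at the variant's chromosome and position, and multi-base alleles and strand ambiguity would need separate handling.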
Title | ieugwaspy |
Description | The IEU GWAS database comprises over 10,000 curated, QC'd and harmonised complete GWAS summary datasets and can be queried using an API. See here for documentation on the API itself. This Python package is a wrapper to make generic calls to the API, plus convenience functions for specific queries. |
Type Of Technology | Software |
Year Produced | 2020 |
Open Source License? | Yes |
Impact | N/A |
URL | https://github.com/MRCIEU/ieugwaspy |
Title | pygwasvcf |
Description | pygwasvcf provides a wrapper around pysam and rsidx to parse and query VCF files containing GWAS summary statistics and trait metadata. See also gwasvcf, an R package for parsing GWAS-VCF files. |
Type Of Technology | Software |
Year Produced | 2020 |
Open Source License? | Yes |
Impact | This tool has been adopted by the IEU OpenGWAS database to promote the use of a standard GWAS VCF format with datasets downloaded from the database |
URL | https://github.com/MRCIEU/pygwasvcf |
Title | varGWAS |
Description | Software to perform genome-wide association study of SNP effects on trait variance |
Type Of Technology | Webtool/Application |
Year Produced | 2021 |
Open Source License? | Yes |
Impact | This has been used to generate variance QTL on biomarkers in UK Biobank |
Description | A seminar for The Seventh Affiliated Hospital, Sun Yat-sen University |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The seminar aimed to introduce genetics concepts, data and methods to clinicians at the hospital. We promoted the methods and databases we have built in Bristol during the seminar, and sought to set up collaborations afterwards. |
Year(s) Of Engagement Activity | 2020 |
Description | Elastic Community Conference: Improving the accessibility of 100 billion genetic associations |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Industry/Business |
Results and Impact | Presentation on the implementation of the IEU OpenGWAS database on ElasticSearch in Oracle Cloud (https://gwas.mrcieu.ac.uk). |
Year(s) Of Engagement Activity | 2021 |
URL | https://youtu.be/Okvad9D4kT0 |
Description | Genetic study of proteins is a breakthrough in drug development for complex diseases |
Form Of Engagement Activity | A press release, press conference or response to a media enquiry/interview |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Press release on an innovative genetic study of blood protein level in collaboration with pharmaceutical partners, showcasing a key Nature Genetics paper which demonstrated how genetic data can be used to support drug target prioritisation by identifying the causal effects of proteins on diseases. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.bristol.ac.uk/news/2020/september/genetic-study-of-proteins.html |
Description | Innovation Lab by QTEC, University of Bristol |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | Local |
Primary Audience | Industry/Business |
Results and Impact | One researcher participated in this Innovation Lab event, which provided business consultancy to a start-up producing smart socks that detect distress in non-verbal Alzheimer's patients. Their task was to advise on how to engage the start-up's primary market (care homes). |
Year(s) Of Engagement Activity | 2022 |
Description | Mendelian randomization for African scientists |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Six researchers from the MRC IEU organised a five-day course on Mendelian randomization for African researchers in Kilifi. The aim was to teach participants how to implement Mendelian randomization (MR) and how to use the IEU-developed, open-source MR-Base software platform. The UK researchers and the African scientists also spent time discussing their own research interests, stimulating potential future collaborations. Participant feedback was extremely positive, with participants leaving with the skills and knowledge to apply MR in their own research. Some individuals are now planning research visits to the UK: one has since secured a visiting fellowship and another has made a funding application. |
Year(s) Of Engagement Activity | 2022 |
URL | https://ieureka.blogs.bristol.ac.uk/2023/01/27/genetic-epidemiology-african-scientists/ |
Description | Presentation at ASHG in San Diego - D Baird |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Dr Denis Baird was invited to give a presentation at the annual genetics conference for the American Society of Human Genetics to communicate main findings from research into identifying the genes underlying neurological/psychiatric conditions. The presentation was entitled: Identifying the tissue-specific influence of gene expression on neurological and psychiatric traits: a Mendelian Randomization study on gene expression within the human brain. |
Year(s) Of Engagement Activity | 2018 |
Description | Presentation: "Creating, indexing and hosting 250 billion genetic associations with Elastic" at Elastic Meetup |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | Regional |
Primary Audience | Professional Practitioners |
Results and Impact | One of our researchers gave a presentation on our innovative use of ElasticSearch for the IEU GWAS database (https://gwas.mrcieu.ac.uk) to a Regional Elastic Meetup. |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.meetup.com/South-West-Elastic-Fantastics/events/265525501/ |