UK Infrastructure for Large-scale Clinical Genomics Research
Lead Research Organisation:
Queen Mary University of London
Department Name: UNLISTED
Abstract
This proposal to the MRC will establish a shared, secure, high performance data and compute infrastructure as a platform for large-scale clinical genomics research based on the data flows of the UK 100,000 Genomes project.
Samples and data from patients with cancer and rare, inherited disorders will be provided by NHS England, working in collaboration with Cancer Research UK and programmes funded by the NIHR and the MRC. Genomics England, a company wholly owned by the Department of Health, will pay for the generation of whole genome sequence data.
Genomics England will pay also for the generation of summary reports, based upon clinical annotations of this data, and will return these to the NHS to support patient care. Genomics England will make anonymised, redacted versions of the data available for industrial research strictly within a secure, managed environment.
The proposed infrastructure will provide a similar environment for academic research, with a more comprehensive collection of genomic and patient data, including the read-level data used for the generation of variant calls and summary reports. The infrastructure will include software tools to support the production of 'research-ready' data sets, the effective management of patient and genomic data, and the delivery of collaborative clinical research.
The project partners have experience in infrastructure development and clinical genomics research, and will be able to reuse designs, procedures, and software developed and tested within existing programmes and organisations, including UK Biobank and the European Bioinformatics Institute.
A formal mechanism will be established for engagement with public, charitable, and philanthropic funders, and with the clinical research projects that they fund. Subject to capacity constraints, projects that add appropriate value to the Genomics England programme will be provided with access to the compute infrastructure at no charge.
Technical Summary
This proposal to the MRC will establish a shared, secure, high performance data and compute infrastructure as a platform for large-scale clinical genomics research based on the data flows of the UK 100,000 Genomes project.
Samples and data from patients with cancer and rare, inherited disorders will be provided by NHS England, working in collaboration with Cancer Research UK and programmes funded by the NIHR and the MRC. Genomics England, a company wholly owned by the Department of Health, will pay for the generation of whole genome sequence data. Genomics England will pay also for the generation of summary reports, based upon clinical annotations of this data, and will return these to the NHS to support patient care. Genomics England will make anonymised, redacted versions of the data available for industrial research strictly within a secure, managed environment. The proposed infrastructure will provide a similar environment for academic research, with a more comprehensive collection of genomic and patient data, including the read-level data used for the generation of variant calls and summary reports. The infrastructure will include software tools to support the production of 'research-ready' data sets, the effective management of patient and genomic data, and the delivery of collaborative clinical research. The project partners have experience in infrastructure development and clinical genomics research, and will be able to reuse designs, procedures, and software developed and tested within existing programmes and organisations, including UK Biobank and the European Bioinformatics Institute. A formal mechanism will be established for engagement with public, charitable, and philanthropic funders, and with the clinical research projects that they fund. Subject to capacity constraints, projects that add appropriate value to the Genomics England programme will be provided with access to the compute infrastructure at no charge.
Organisations
- Queen Mary University of London (Lead Research Organisation)
- Department of Health Social Services and Public Safety (DHSSPS) (Collaboration)
- Quintiles Transnational Corporation (Collaboration)
- Ludwig Maximilian University of Munich (LMU Munich) (Collaboration)
- University of Cambridge (Collaboration)
Publications
Liu DJ
(2017)
Exome-wide association study of plasma lipids in >300,000 individuals.
in Nature genetics
Arno G
(2017)
Biallelic Mutation of ARHGEF18, Involved in the Determination of Epithelial Apicobasal Polarity, Causes Adult-Onset Retinal Degeneration.
in American journal of human genetics
Van Den Berg ME
(2017)
Discovery of novel heart rate-associated loci using the Exome Chip.
in Human molecular genetics
Sabatine M
(2017)
Evolocumab and Clinical Outcomes in Patients with Cardiovascular Disease
in New England Journal of Medicine
Wu H
(2018)
SemEHR: A general-purpose semantic search system to surface semantic data from clinical notes for tailored care, trial recruitment, and clinical research.
in Journal of the American Medical Informatics Association : JAMIA
Jackson R
(2018)
CogStack - experiences of deploying integrated information retrieval and extraction services in a large National Health Service Foundation Trust hospital.
in BMC medical informatics and decision making
Robbe P
(2018)
Clinical whole-genome sequencing from routine formalin-fixed, paraffin-embedded specimens: pilot study for the 100,000 Genomes Project
in Genetics in Medicine
Description | Chief Scientist Genomics England |
Geographic Reach | National |
Policy Influence Type | Influenced training of practitioners or researchers |
URL | http://www.genomicsengland.co.uk |
Description | Genomics England Newborn screening funded to £100m |
Geographic Reach | National |
Policy Influence Type | Contribution to new or Improved professional practice |
Impact | No impact yet as service only just agreed to be funded. |
URL | https://www.genomicsengland.co.uk/initiatives/newborns |
Description | UK Clinical Genomics Infrastructure: Co-lead for the preparation for commissioning in the NHS of a National Genomic Health service |
Geographic Reach | National |
Policy Influence Type | Membership of a guideline committee |
Description | UK Clinical Genomics Infrastructure: Member of the Topol Review of Digital, Genomics and Artificial Intelligence implications for workforce planning. |
Geographic Reach | National |
Policy Influence Type | Membership of a guideline committee |
Description | COVID-19 (with CCO) |
Amount | £5,000,000 (GBP) |
Organisation | LifeArc |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 03/2020 |
End | 03/2021 |
Description | COVID-19 Matched WGS |
Amount | £9,890,000 (GBP) |
Organisation | Illumina |
Sector | Private |
Country | United States |
Start | 03/2020 |
End | 03/2021 |
Description | COVID-19 WGS |
Amount | £3,000,000 (GBP) |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 03/2020 |
End | 03/2021 |
Description | Illumina matched-funds re UK Life Sciences Cancer WGS (Genomics England) |
Amount | £2,250,000 (GBP) |
Organisation | Illumina |
Sector | Private |
Country | United States |
Start | 03/2020 |
End | 03/2022 |
Description | Inward Capital co-investment and 100 science jobs at Illumina |
Amount | £22,000,000 (GBP) |
Organisation | Illumina Inc. |
Sector | Private |
Country | United States |
Start | 03/2020 |
End | 03/2025 |
Description | Long-Read Cancer Sequencing |
Amount | £162,000 (GBP) |
Organisation | Oxford Nanopore Technologies |
Sector | Private |
Country | United Kingdom |
Start | 03/2020 |
End | 03/2022 |
Description | REACT-GE (COVID controls) |
Amount | £1,500,000 (GBP) |
Organisation | Medical Research Council (MRC) |
Sector | Public |
Country | United Kingdom |
Start | 03/2020 |
End | 03/2021 |
Description | UK Clinical Genomics Research Data Infrastructure |
Amount | £2,700,000 (GBP) |
Organisation | Department of Health (DH) |
Sector | Public |
Country | United Kingdom |
Start | 03/2018 |
End | 03/2021 |
Description | UK Life Sciences Cancer WGS (Genomics England) |
Amount | £7,870,000 (GBP) |
Organisation | Innovate UK |
Sector | Public |
Country | United Kingdom |
Start | 03/2020 |
End | 03/2022 |
Title | Genotyping technology |
Description | TaqMan genotyping is a main workhorse for SNP genotyping. We adapted a methodology for reaction miniaturisation from KBioscience for nanolitre reaction volumes, reducing the cost of genotyping by 50%. |
Type Of Material | Technology assay or reagent |
Year Produced | 2006 |
Provided To Others? | Yes |
Impact | added value for funders |
Title | High throughput genotyping and sequencing hub |
Description | Barts and The London Genome Centre. Offers high throughput genomics infrastructure to internal and external users including hotel facilities for scientists. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2008 |
Provided To Others? | Yes |
Impact | Multiple major publications in common disease. |
Title | Improved techniques |
Description | The sample handling approaches and standard operating procedures for phenotyping have been used to develop the automated Biobank sample handling system. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2006 |
Provided To Others? | Yes |
Impact | Now handling 500,000 samples for UK Biobank |
Title | NHS Genomic Medicine Centres |
Description | I created and established the concept of NHS Genomic Medicine Centres in England which has led to NHS England Commissioning this capacity and capability framework for the 100,000 Genomes Project |
Type Of Material | Improvements to research infrastructure |
Provided To Others? | No |
Impact | Led to NHS England Commissioning this capacity and capability framework for the 100,000 Genomes Project |
Title | Phenotypic and genotypic database |
Description | Initially a Microsoft Access relational database, which we migrated to a MySQL database holding all phenotypic and genotypic data for analysis and ease of collaboration. Several other studies have copied or been helped to adapt our approach. |
Type Of Material | Biological samples |
Year Produced | 2007 |
Provided To Others? | Yes |
Impact | Others have adopted the database structure for similar phenotypic collections |
Title | The Genomics England Clinical Interpretation Partnership |
Description | We have established, launched and called for expressions of interest to the UK NHS, academics and training to form domains to enhance clinical interpretation of the data from the 100,000 genomes project. |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2014 |
Provided To Others? | Yes |
Impact | Receiving expressions of interest for forming GeCIPs from research community |
Title | The UK Clinical Genomics Infrastructure: Clinical Data |
Description | Improvement to the wider UK Clinical Genomics Infrastructure |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2018 |
Provided To Others? | Yes |
Impact | The infrastructure now holds 1.6 billion data points on 94,000 participants and 91,000 whole genomes and recently cancer registry and mortality data (2141 participants with cause of death). |
URL | https://www.genomicsengland.co.uk/ |
Title | UK Clinical Genomics Research Infrastructure |
Description | Data Centre |
Type Of Material | Improvements to research infrastructure |
Year Produced | 2014 |
Provided To Others? | Yes |
Impact | The research data centre for analysis and interpretation of the 100,000 Genomes Project |
Description | Genome Wide Association Study of Lacunar Stroke |
Organisation | University of Cambridge |
Department | Department of Physiology, Development and Neuroscience |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Statistical analysis |
Collaborator Contribution | Data collection, oversight |
Impact | Manuscript in press at Lancet Neurology |
Start Year | 2019 |
Description | Genomics England |
Organisation | Department of Health Social Services and Public Safety (DHSSPS) |
Country | United Kingdom |
Sector | Public |
PI Contribution | I have been Chief Scientist for the 100,000 whole genome sequencing programme since 2013. I led and created the consortium that won the grant that creates this data centre for the research component of the 100,000 Genomes Project. This goes live for the main programme imminently (see further funding). |
Collaborator Contribution | We are completing pilots in rare disease and cancer |
Impact | We have: returned diagnoses to the NHS; created 13 NHS Genomic Medicine Centres across England that serve to enrol, supply clinical data, and validate feedback to patients; embarked on the main programme; formed a 12-company consortium to create academic-NHS-industry partnerships; and 9 HE institutes now offer a Master's in Genomic Medicine. |
Start Year | 2013 |
Description | Immune Mechanisms in Small Vessel Disease |
Organisation | Ludwig Maximilian University of Munich (LMU Munich) |
Country | Germany |
Sector | Academic/University |
PI Contribution | Study design, Primary analysis, study oversight |
Collaborator Contribution | Statistical analysis and interpretation |
Impact | Manuscript under review to BRAIN |
Start Year | 2020 |
Description | Quintiles Prime Site |
Organisation | Quintiles Transnational Corporation |
Country | United States |
Sector | Private |
PI Contribution | I lead the world's first Prime Site, which concentrates trials in a single site management organisation at Barts Health and Queen Mary University of London. |
Collaborator Contribution | Our collaboration with Quintiles, now extended across UCLP, created a world-leading trials hub bringing 168 new trials to 3356 UK patients (£20m) and leading to the creation of 25 similar "Prime Sites" worldwide. |
Impact | It ranges across all therapeutic areas |
Start Year | 2008 |
Description | Chief Scientist for Genomics England - Progress Educational Trust |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | Public meeting on the 100,000 Genomes Project; evoked discussions on the programme and data handling. Engagement from patient community. |
Year(s) Of Engagement Activity | 2014 |
Description | Chief Scientist for Genomics England Town Hall meetings |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Public/other audiences |
Results and Impact | Presented and co-led 3 of these meetings. Meetings sparked interesting and lively debates on the 100,000 Genomes Project and what it means for patients. Project picked up by social media. Further events planned. |
Year(s) Of Engagement Activity | 2014 |
Description | Genomics England - 100k Genome Project (Multiple National & International Talks 2015-2018) |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Multiple talks about the 100k Genomes project as GEL Chief Scientist |
Year(s) Of Engagement Activity | 2015,2016,2017,2018 |
URL | http://www.genomicsengland.co.uk |
Description | Range of Genomics related talks |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Public/other audiences |
Results and Impact | The future of genomics in the delivery of healthcare |
Year(s) Of Engagement Activity | 2021,2022 |
Description | The Genomics Conversation |
Form Of Engagement Activity | Participation in an activity, workshop or similar |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Activities: • Events took place all over the country, and a big thank you goes to the teams in Lincoln, York and Nottingham for their awareness-raising activities. • We launched our new nursing video, 'Nursing in the Genomic Era'. • We hosted our fifth WeNurses chat on 'Nursing and Ethics in the Genomic Era'. • We ran a competition using the Genomics Game to conclude the week. • Four #GenomicsConversation podcasts were launched on SoundCloud to introduce nurses and midwives to genomics. • We held our first ever #GenomicsConversation Thunderclap to launch the week's activities. • We organised a social media pledge campaign with enthusiasts spreading the message far and wide on social media. Engagement: • During the course of the week the website received over 12,000 page views. • Our first ever Thunderclap was a great success, delivering a huge social reach with influential supporters from nursing including WeNurses, AgencyNurse and 6CsLive! • Our four #GenomicsConversation podcasts were streamed over 80 times during the week. • We received over 1000 views of our videos. |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.genomicseducation.hee.nhs.uk/woa-18/ |