Bioinformatics for mouse genomics and genetics

Lead Research Organisation: MRC Mammalian Genetics Unit

Abstract

The mouse is widely used as a model for human disease. Our aim is to provide computational tools and analytical results that help us to better understand the functions of mouse genes and, therefore, their human counterparts. We have two main areas of interest. The first of these is the collection, storage and analysis of mouse phenotype data. Projects to systematically characterise the phenotypes of mice that have had single genes disabled have started and are likely to become increasingly important in the next few years. We have developed mechanisms for collecting, storing and displaying this data and have a strong interest in the representation of phenotype data using ontologies and in its analysis. With the advent of new, high-throughput technologies for DNA sequencing we have the opportunity to understand in much greater detail the mutations giving rise to individual phenotypic changes and their effects on protein function and gene regulation. We are developing software to facilitate this analysis in the context of the Unit’s research.

Technical Summary

The BioComputing Group's research is in two key areas: mouse phenotyping and next generation sequencing. Both areas are critical to mouse genetics and interlink with research programmes at MRC Harwell and activities in wider consortia in the community in which MRC Harwell is intimately involved. The group's specific areas of interest include the development of tools and novel approaches for the annotation of mouse phenotypes, which underpins the development, utilisation and dissemination of disease models - a vital broader remit for MRC Harwell. This research includes developing phenotype databases (www.europhenome.org) for the acquisition and dissemination of mouse phenotyping information, an activity that is interlinked with our leading involvement in the coordination of European and International programmes in mouse phenotyping (www.mousephenotype.org). We are also leading the Data Coordination Centre (DCC) for the NIH KOMP2 program in collaboration with the WTSI and EBI. Our sequence analysis work focusses on the interpretation of Next-Generation Sequencing data as they relate to major projects going on in the MGU, with particular emphasis at present on SNP identification in homozygous and heterozygous ENU mutant lines. This is leading to the more rapid identification of point mutations involved in the phenotypes of these lines.
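
As an illustration of the kind of zygosity call involved in SNP identification in homozygous and heterozygous mutant lines, the following is a minimal Python sketch. It is not the group's pipeline: the function name, the read-count inputs and every threshold are assumptions chosen for illustration only.

```python
def classify_genotype(ref_reads: int, alt_reads: int,
                      min_depth: int = 10,
                      het_min: float = 0.3, het_max: float = 0.7,
                      hom_min: float = 0.9, ref_max: float = 0.1) -> str:
    """Classify a candidate SNP site from reference/alternate read counts.

    All thresholds are illustrative assumptions, not published values.
    """
    depth = ref_reads + alt_reads
    # Too few reads to make any call.
    if depth < min_depth:
        return "low_coverage"
    alt_fraction = alt_reads / depth
    # Nearly all reads carry the alternate allele.
    if alt_fraction >= hom_min:
        return "homozygous_alt"
    # Roughly half of the reads carry the alternate allele.
    if het_min <= alt_fraction <= het_max:
        return "heterozygous"
    # Almost no reads carry the alternate allele.
    if alt_fraction <= ref_max:
        return "reference"
    # Anything in between is left uncalled.
    return "ambiguous"


if __name__ == "__main__":
    for ref, alt in [(0, 30), (15, 15), (30, 0), (24, 6), (3, 2)]:
        print(ref, alt, classify_genotype(ref, alt))
```

In practice a real analysis would also need to account for base and mapping quality and for variants already present in the background strain; those steps are omitted here.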

Publications

 
Description Coordination and Sustainability of Mouse Informatics Resources
Geographic Reach Multiple continents/international 
Policy Influence Type Citation in other policy documents
Impact CASIMIR is developing guidelines for database integration and data release at an international level.
 
Description EU FP6 CA
Amount £1,000,000 (GBP)
Organisation Sixth Framework Programme (FP6) 
Sector Public
Country European Union (EU)
Start 03/2007 
End 03/2010
 
Description NIH Common Fund
Amount $12,000,000 (USD)
Organisation National Institutes of Health (NIH) 
Sector Public
Country United States
Start 08/2016 
End 08/2021
 
Description NIH RFA-RM-10-012 Knockout Mouse Phenotyping Project Database (U54)
Amount £1,438,707 (GBP)
Organisation National Institutes of Health (NIH) 
Sector Public
Country United States
Start 08/2011 
End 07/2016
 
Title EuroPhenome 
Description Database of raw mouse phenotyping data from the EUMODIC consortium 
Type Of Material Biological samples 
Year Produced 2007 
Provided To Others? Yes  
Impact The database will likely form the basis for the International Mouse Phenotyping Consortium database 
URL http://www.europhenome.org
 
Title IMPC
Description Database of all phenotyping data from the International Mouse Phenotyping Consortium 
Type Of Material Database/Collection of Data/Biological Samples 
Year Produced 2013 
Provided To Others? Yes  
Impact Significant impact of mouse functional data being available in the public domain for researchers to identify models of human disease 
 
Title IMPRESS 
Description Database of Standard operating procedures from IMPC 
Type Of Material Database/Collection of Data/Biological Samples 
Provided To Others? No  
Impact The standards in this database are utilised by all centres in IMPC 
URL http://www.mousephenotype.org/impress
 
Title MouseBook 
Description A data portal providing access to data on mouse lines held at MRC Harwell 
Type Of Material Biological samples 
Year Produced 2008 
Provided To Others? Yes  
Impact The portal has had a significant effect on numbers of requests for mouse lines from FESA 
URL http://www.mousebook.org
 
Description 100K Genomes 
Organisation Genomics England
Country United Kingdom 
Sector Public 
PI Contribution My research within the IMPC project will be integrated with the 100K genomes project
Collaborator Contribution The partner at QMUL has developed PhenoDIGM for integrating mouse and human data.
Impact www.mousephenotype.org
Start Year 2015
 
Description Birmingham 
Organisation University of Birmingham
Department Centre for Computational Biology
Country United Kingdom 
Sector Academic/University 
PI Contribution The data from IMPC will be utilised to interrogate the clinical data from Birmingham
Collaborator Contribution Ontology development
Impact No outcomes yet
Start Year 2015
 
Description CASIMIR 
Organisation University of Cambridge
Department Department of Physiology, Development and Neuroscience
Country United Kingdom 
Sector Academic/University 
PI Contribution I have jointly coordinated this project
Collaborator Contribution CASIMIR has supported meetings furthering international coordination
Impact Publication in Nature on post-publication data release; policy on data release for large-scale phenotyping projects PMID: 19741686 PMID: 19649761 PMID: 19306394 PMID: 19112082
Start Year 2006
 
Description DECOVID 
Organisation Alan Turing Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution Using detailed, frequently updated health data in a secure database, providing up to date information about patient care during the COVID-19 pandemic. The data will be analysed to answer the most pressing clinical questions to support the COVID-19 emergency response and to improve the quality of patient care for the future. The contribution from my group is to support the data capture, data wrangling and export to the analysts.
Collaborator Contribution All details on www.decovid.org
Impact NA
Start Year 2020
 
Description DECOVID 
Organisation University College London
Country United Kingdom 
Sector Academic/University 
PI Contribution Using detailed, frequently updated health data in a secure database, providing up to date information about patient care during the COVID-19 pandemic. The data will be analysed to answer the most pressing clinical questions to support the COVID-19 emergency response and to improve the quality of patient care for the future. The contribution from my group is to support the data capture, data wrangling and export to the analysts.
Collaborator Contribution All details on www.decovid.org
Impact NA
Start Year 2020
 
Description DECOVID 
Organisation University of Birmingham
Country United Kingdom 
Sector Academic/University 
PI Contribution Using detailed, frequently updated health data in a secure database, providing up to date information about patient care during the COVID-19 pandemic. The data will be analysed to answer the most pressing clinical questions to support the COVID-19 emergency response and to improve the quality of patient care for the future. The contribution from my group is to support the data capture, data wrangling and export to the analysts.
Collaborator Contribution All details on www.decovid.org
Impact NA
Start Year 2020
 
Description EDON 
Organisation Alan Turing Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution The Fingerprint Analytics Working Group is composed of data scientists and is responsible for developing, validating and refining machine learning 'fingerprint' models that can identify prospective data patterns which are predictive of specific dementia-causing diseases.
Collaborator Contribution EDoN brings together global experts in data science, digital technology and neurodegeneration to collect and decode huge amounts of digital health data generously donated by people like you. We aim to develop digital data fingerprints that pick up the earliest changes in the brain in diseases like Alzheimer's and can be built into wearable technologies like watches or headbands. These fingerprints could transform research efforts today, helping scientists make faster breakthroughs in understanding the disease and testing potential new preventions and treatments.
Impact NA
Start Year 2019
 
Description EDON 
Organisation Alzheimer's Research UK
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution The Fingerprint Analytics Working Group is composed of data scientists and is responsible for developing, validating and refining machine learning 'fingerprint' models that can identify prospective data patterns which are predictive of specific dementia-causing diseases.
Collaborator Contribution EDoN brings together global experts in data science, digital technology and neurodegeneration to collect and decode huge amounts of digital health data generously donated by people like you. We aim to develop digital data fingerprints that pick up the earliest changes in the brain in diseases like Alzheimer's and can be built into wearable technologies like watches or headbands. These fingerprints could transform research efforts today, helping scientists make faster breakthroughs in understanding the disease and testing potential new preventions and treatments.
Impact NA
Start Year 2019
 
Description ENFIN 
Organisation EMBL European Bioinformatics Institute (EMBL - EBI)
Country United Kingdom 
Sector Academic/University 
PI Contribution We have developed a mathematical model of core biochemistry of pancreatic beta cells. We have also constructed a database registry for systems biology databases
Collaborator Contribution Building connections with systems biology groups across Europe
Impact PMID: 17514510
Start Year 2006
 
Description Leicester university MSc Bioinformatics 
Organisation University of Leicester
Country United Kingdom 
Sector Academic/University 
PI Contribution We provided a final year research project, carried out at Harwell, to an MSc student on the Bioinformatics MSc course. The team and I supervised the student. The research project was mostly based on sequencing.
Collaborator Contribution The university provided the MSc student.
Impact The outcome of this partnership is the successful completion of the MSc course. In addition, two of the students went on to do a PhD whilst the third achieved a position at the Sanger in the BioStatistics lab.
Start Year 2011
 
Description MPI2 
Organisation EMBL European Bioinformatics Institute (EMBL - EBI)
Country United Kingdom 
Sector Academic/University 
PI Contribution The contribution made is to lead the development of the IMPC data coordination centre
Collaborator Contribution The WTSI contributes statistical analysis and EBI is the CDA
Impact NA
Start Year 2011
 
Description MPI2 
Organisation The Wellcome Trust Sanger Institute
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution The contribution made is to lead the development of the IMPC data coordination centre
Collaborator Contribution The WTSI contributes statistical analysis and EBI is the CDA
Impact NA
Start Year 2011
 
Description Novartis 
Organisation Novartis
Department Novartis Statistical Methodology Group
Country Global 
Sector Private 
PI Contribution This is a newly funded collaborative project between MRC Harwell and the Novartis/University of Oxford Big Data Institute to establish a world-leading research alliance that will improve drug development by making it more efficient and more targeted. The alliance will make use of anonymised data from approximately 5 million patients from the UK and international partner organisations, together with anonymised data captured from relevant Novartis clinical trials. Using the BDI's latest statistical machine learning technology and experience in data analysis, combined with Novartis' wealth of clinical expertise and clinical trial data, the alliance expects to predict how patients will respond to existing and new medicines. Furthermore, through the development of an innovative IT environment and AI technology, the alliance will work to identify patterns in data, often across multiple data sources and types (imaging, genomics, clinical and biological), which cannot be detected by humans alone. My team will contribute the data wrangling skills to develop the informatics environment and capture/integrate the data.
Collaborator Contribution Novartis will share the data and expertise and the BDI will contribute the statistical analysis team.
Impact No outcomes to date.
Start Year 2018
 
Description Oxford Big Data Institute 
Organisation University of Oxford
Department Big Data Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution Shared DPhil students
Collaborator Contribution Access to data.
Impact Not to date.
Start Year 2017
 
Description Statistical Modelling and Machine Learning Laboratory - COVID 
Organisation Alan Turing Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution Support of data wrangling for data from JBC to the Lab
Collaborator Contribution https://www.turing.ac.uk/news/alan-turing-institute-and-royal-statistical-society-support-joint-biosecurity-centre-covid-19
Impact NA
Start Year 2020
 
Description Statistical Modelling and Machine Learning Laboratory - COVID 
Organisation Royal Statistical Society
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution Support of data wrangling for data from JBC to the Lab
Collaborator Contribution https://www.turing.ac.uk/news/alan-turing-institute-and-royal-statistical-society-support-joint-biosecurity-centre-covid-19
Impact NA
Start Year 2020
 
Description UCL - Bone uCT analysis 
Organisation University College London
Department Institute of Child Health
Country United Kingdom 
Sector Academic/University 
PI Contribution Analysis of a large dataset of bone uCT data
Collaborator Contribution Contribution of the data and expert knowledge
Impact Identification of new areas of interest to the scientists on morphological changes in the bone structure
Start Year 2016
 
Title Anonymus 
Description Anonymus is a LIMS system to store and disseminate mouse husbandry, phenotyping, tissue archive and genotyping data. 
Type Of Technology Webtool/Application 
Year Produced 2006 
Impact Our main customer is the Prion Unit in London. They are using the system to manage their animal house and experiments. The main impact is that they can now manage their animal house effectively and benefit from Harwell's experience and support. 
 
Description Beamline B23 Roadshows 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? Yes
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Participants submitted a number of proposals for future beamline time.

The synchrotron-based CD technique is now being used in both life science and material science research with applications in the medical and biomedical fields.
Year(s) Of Engagement Activity 2013,2014
URL http://www.diamond.ac.uk/Home/Events/2014/B23-Workshop-Series.html
 
Description BioScience Journal article on MLC services (Winter 2015 edition, p.34-35) 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact An article.
Year(s) Of Engagement Activity 2015
URL http://issuu.com/distinctivepublishing/docs/bsj06
 
Description Careers Event at Downs School 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact Talked to school pupils about a career in science and answered their questions.

Many children asked what qualifications are required for a career in science. Some asked whom to contact for work experience. Most had never heard of bioinformatics, so they all learnt something new.
Year(s) Of Engagement Activity 2014
 
Description Diamond Open Days (2 days) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact MRC Harwell was at Diamond Light Source's open days this weekend, 14-15 March 2015, talking about the work we do at Harwell Campus and its impact, including Dr Mary Lyon's work on X-inactivation.
Year(s) Of Engagement Activity 2015
URL http://www.har.mrc.ac.uk/news-events/news-archive/diamond-light-source-open-days
 
Description IMPC banner advert on PLOS Genetics website 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A banner for the IMPC project was advertised on the PLOS Biology website
Year(s) Of Engagement Activity 2015
 
Description MRC Harwell Open Day 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact The bioinformatics department led an activity during the 2013 centenary centre open day. The feedback was very positive and initiated a lot of discussions on topics ranging from imaging and sequencing to phenotyping and data management.


The Open Day itself was highly successful, with numerous positive comments. Many people had a better perception of the work at Harwell.
Year(s) Of Engagement Activity 2014
 
Description MRC Mouse Network Meeting 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact A scientific community meeting for all scientists from the MRC Mouse Networks.
Year(s) Of Engagement Activity 2015
 
Description Mary Lyon's obituary published in the Guardian 
Form Of Engagement Activity A magazine, newsletter or online publication
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Public/other audiences
Results and Impact Mary Lyon's obituary published in the Guardian
Year(s) Of Engagement Activity 2015
 
Description Next Generation Sequencing Lecture to Postgraduate students 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? Yes
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact The lecture was part of the Reading University Bioinformatics course curriculum. The talk encouraged discussion on the mouse and/or human genome and sequencing. As part of the course, the students have to answer questions on the lecture in an exam-type setting.

I had good feedback from the course tutors and was asked to return the following year to give the same lecture.
Year(s) Of Engagement Activity 2013,2014
 
Description Oxfordshire Science Festival 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Participation in the Oxfordshire Science Festival
Year(s) Of Engagement Activity 2015
URL http://www.har.mrc.ac.uk/news-events/news-archive/oxfordshire-science-festival-2015
 
Description Scientific presentation 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? Yes
Geographic Reach National
Primary Audience Other academic audiences (collaborators, peers etc.)
Results and Impact Presentation on the team's NGS results and Harwell's Ageing Screen.

Met potential collaborators and other scientists in the field of genomics, NGS and clinical domains.
Year(s) Of Engagement Activity 2013
 
Description Scientific presentation 
Form Of Engagement Activity Scientific meeting (conference/symposium etc.)
Part Of Official Scheme? Yes
Type Of Presentation paper presentation
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Talk on the Ageing Screen and NGS results. The audience asked a lot of questions about the team's results.

Met other scientists and potential collaborators in the field of NGS and genomics.
Year(s) Of Engagement Activity 2014
 
Description St Birinus School site visit 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact A talk on LIMS was given by the Anonymus team to the A Level Applied Science students.


Feedback positive, some interest in Harwell's technician apprentice scheme.
Year(s) Of Engagement Activity 2014
 
Description Supervising MSc Research projects (3 months each student) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? Yes
Geographic Reach Regional
Primary Audience Postgraduate students
Results and Impact Students work at Harwell and produce a research thesis over a 3-month period.

Two students went on to do a PhD, and all students learnt new, up-to-date Next Generation Sequencing techniques. All students passed their MSc with a high score.
Year(s) Of Engagement Activity 2011,2012,2013
 
Description Supervising SEPNet (South East Physics Network) students (8 wks ea student) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? Yes
Geographic Reach National
Primary Audience Undergraduate students
Results and Impact Students undertake an eight-week summer studentship to evaluate a variety of open source segmentation programs applied to MicroCT reconstructions of E14.5 mouse embryos. The 2013 student was awarded a best poster and presentation prize at the SEPNet conference later that year.

Both students learnt up-to-date techniques in mouse embryo imaging and were each awarded an undergraduate degree with a good grade.
Year(s) Of Engagement Activity 2013,2014
 
Description Training CD beamline users on use of custom-built software 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? Yes
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Better understanding of beamline results

Heightened interest in carrying out data analysis using the beamline customised software
Year(s) Of Engagement Activity 2013,2014
URL http://confluence.diamond.ac.uk/display/B23Tech/CD+Apps+documentation
 
Description Tutoring undergraduates at the University of Reading 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? Yes
Geographic Reach National
Primary Audience Undergraduate students
Results and Impact Encouraged questions and discussions regarding lessons

Better understanding of modules
Year(s) Of Engagement Activity 2013,2014
 
Description Work Experience 
Form Of Engagement Activity Participation in an open day or visit at my research institution
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Schools
Results and Impact Here we have taught school children about the merits of a bioinformatics career and about the everyday life and/or job description of a bioinformatician. The children have been in groups of 2-6 and come from many different schools. This exercise can occur at various times in the year and reaches many local schools such as Forest Sch Wokingham, Fitzharry Abingdon, Blue Coat Reading, Kingsdown Swindon, Didcot Girls, Henry Box Witney etc.

On many occasions the children are fascinated by our work; it stimulates thinking and gives them a brief insight into the life of a scientist.
Year(s) Of Engagement Activity 2011,2012,2013,2014