PhenoImageShare - a phenotype image annotation, sharing and discovery platform
Lead Research Organisation:
University of Edinburgh
Department Name: Sch of Molecular. Genetics & Pop Health
Abstract
Scientists have now completed the sequencing of the genome of many organisms including man, mouse, chicken and other vertebrates. The next major challenge is to understand the function of the genes and other parts of the genome in the living organism in normal, abnormal and diseased conditions. The observed variation in the living organism is called phenotype and the grand challenge in biological science is to understand how the variation of the genome of an individual influences and changes the observed phenotype. This will provide critical information to allow scientists to decode the complex processes that can lead to abnormal development and disease.
There are many measures of phenotype, for example height or weight. More recently scientists are using 2D and 3D imaging techniques to allow different phenotypes to be observed, including shape change and abnormal development (e.g. hole in the heart), but also at the level of the cellular arrangement and expression of genes. A challenge is to bring together all the pieces of information about phenotype across databases of many hundreds of thousands of images so that scientists can look for patterns that can reveal common and related causes. This project will develop software and databases that will allow the disparate image databases to be integrated in terms of the phenotypes that have been observed for each image.
We will develop a central database and web portal that will allow scientists to search all the contributing image archives to find images that relate to particular conditions and diseases. The tools will include web-based interfaces to generate a standardised version of the phenotype plus spatial or location annotation (rather like the flags on Google Maps) that allow queries to relate to parts of the body. The software will allow individual laboratories or large consortia to publish their phenotype annotations in a way that is integrated with others. This federation of databases allows cross-querying of large image archives without having to bring data together in one place, which would be very difficult to fund and maintain given the data volumes that are now collected.
The PhenoImageShare resource will provide the means to integrate the phenotype images in many resources and allow scientists to search and mine for associations across all the studies, small and large, that might be relevant. This will allow faster access to relevant data and minimise use of animals by reducing re-experimentation that can arise simply because data could not be found.
Technical Summary
Bio-imaging is key to observing and quantifying morphological and histological phenotype. There are major sets of image data capturing 3D and high-res histology images for adult and embryo phenotype. In general the phenotype resources may be searched individually, but there is no mechanism for integration, cross-query and analysis, especially with respect to human abnormality and disease phenotype. Furthermore, the annotations will typically be traits found by manual scanning; secondary analysis for subtle variation, especially at the cellular level, remains rather difficult. Finally, none of the data will be in the context of a spatio-temporal framework for spatial analysis and interoperability with atlas-based resources such as the Allen Brain Atlas and eMouseAtlas.
PhenoImageShare will provide a toolkit for complex image annotation, sharing, discovery and query from federated biological images supporting phenotype description. Feasibility will be demonstrated using high-throughput phenotype images from KOMP2 and IMPC, and histology images of adult tissues, early post-natal (EUCOMMTOOLS) and embryo (EMAGE). Images will be a combination of 3D (OPT, uCT, uMR) and high-resolution histology sections. All tools will be applicable to other model systems. We will develop:
1. federated phenotype query capabilities across image archives;
2. standards for phenotype and spatial annotation with server technology for interoperability;
3. interfaces to annotate phenotype with standard ontologies and within image and atlas spatial markup;
4. a referencing protocol ("data-track") for spatial queries and visualisation with respect to atlas frameworks;
5. a plug-in to an open image-archiving system (OME) enabling lab-publication of annotations to be integrated via the central DB.
The use-cases include demonstrating federated query of multiple image sources using the ontology-based annotation as well as direct spatial queries from a standard atlas framework such as eMouseAtlas.
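The federated query use-case can be illustrated with a minimal sketch. It is not the project's implementation: the class names, field names, archive names, the placeholder term IDs (TERM:0001, TERM:0002) and the bounding-box convention are all assumptions made for illustration. Each archive keeps its own annotations, and the query visits every archive and merges the matches, optionally filtered by a region in a shared atlas coordinate frame.

```python
from dataclasses import dataclass
from typing import Iterable, List, Optional, Tuple

# Axis-aligned box in (hypothetical) atlas coordinates:
# (x_min, y_min, x_max, y_max).
Box = Tuple[float, float, float, float]


@dataclass(frozen=True)
class Annotation:
    archive: str                  # name of the contributing archive
    image_id: str                 # identifier local to that archive
    term: str                     # ontology term ID (placeholder values)
    region: Optional[Box] = None  # optional region of interest


class Archive:
    """Stand-in for one image archive; holds annotations, not images."""

    def __init__(self, name: str, annotations: Iterable[Annotation]):
        self.name = name
        self._annotations = list(annotations)

    def search(self, term: str) -> List[Annotation]:
        # Each archive answers the query from its own annotation store.
        return [a for a in self._annotations if a.term == term]


def boxes_overlap(a: Box, b: Box) -> bool:
    """True if two axis-aligned boxes share at least one point."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]


def federated_query(archives: Iterable[Archive], term: str,
                    region: Optional[Box] = None) -> List[Annotation]:
    """Query every archive for a term and merge the hits.

    If a region is given, keep only annotations whose own region
    overlaps it; annotations without a region are then excluded.
    """
    hits = []
    for archive in archives:
        for ann in archive.search(term):
            if region is None or (
                ann.region is not None and boxes_overlap(ann.region, region)
            ):
                hits.append(ann)
    return sorted(hits, key=lambda a: (a.archive, a.image_id))


# Two hypothetical archives with placeholder annotations.
archive_a = Archive("archive_a", [
    Annotation("archive_a", "img-1", "TERM:0001", (0, 0, 10, 10)),
    Annotation("archive_a", "img-2", "TERM:0002", (5, 5, 8, 8)),
])
archive_b = Archive("archive_b", [
    Annotation("archive_b", "img-9", "TERM:0001", (50, 50, 60, 60)),
    Annotation("archive_b", "img-7", "TERM:0001"),
])

by_term = federated_query([archive_a, archive_b], "TERM:0001")
by_term_and_region = federated_query(
    [archive_a, archive_b], "TERM:0001", region=(5, 5, 20, 20)
)
```

In this sketch only annotations move between archives and the caller, so the images themselves never need to be copied into one place. A real deployment would replace the in-memory `Archive.search` with a call to each archive's own service.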
Planned Impact
As reference genomes and large-scale programmes to generate mutants and knock-outs are completed, there has been a matching effort to establish and codify phenotype with genomic coverage. In parallel, bio-imaging is emerging as a primary mechanism to observe and quantify morphological and histological phenotype in mouse embryos and adults. There are also other major collections of image data, for example from the Sanger Institute's zebrafish phenotype screens and plant screens. Current phenotyping effort will deliver annotations held in independent databases associated with the primary data, which may be searched individually, but there is no mechanism for integration, cross-query and analysis, especially with respect to human abnormality and disease phenotypes. Furthermore, the image annotations will be "obvious" traits found by manual scanning and will not include or allow deeper investigation of more subtle variation, especially at the cellular level. Finally, current data will not be published in the context of a common spatio-temporal framework allowing more complex analysis and interoperability with other atlas-based resources such as the Allen Brain Atlas and eMouseAtlas gene expression databases. PhenoImageShare addresses the need to locate images, share annotations, map to spatial references, provide semantic and spatial queries between images and map these to image and genomic/transcriptomic objects for co-query. Mouse images will be used within the project as these are an excellent set for prototyping and are freely available, but the technology and toolkit are applicable to any phenotype images from any species. The ability to index and share annotation on existing images will be delivered early, with later stages devoted to spatial queries for complex images. Technology will be delivered to the community first at low granularity (image tagging), while working towards integration with reference atlases. PhenoImageShare beneficiaries:
- Biological researchers: will be able to access and query phenotype (and the underlying images) in the context of genome/transcriptome, and be able to make spatial queries between large image sets.
- Bioinformatics resource providers: benefit as they will be able to access the annotations on images from the annotation server, register their own images, access annotation and viewing tools for images without importing all the images.
- Image collection owners: will be able to provide their image annotation and location using a standard protocol, realising low-cost provision of access to public images. Image queries will be provided at a range of granularities, allowing collection owners to iterate and provide increasingly rich data as their images are annotated and projects progress.
- Publishers: will be able to provide image annotations related to papers.
- Translational researchers in industry and academia: the images in question support cellular phenotyping and complex phenotypes, and many of the image subjects are models for disease research.
- Funders: images archived, well annotated, shared and re-used, maximising value for research spend and researcher effort as well as promoting data sharing.
- Image analysis experts: a searchable corpus of images on which to develop methodology.
- Scientists in training: seeing images in the context of a reference atlas is an excellent way to learn.
- Owners or funders of reference atlases: spatial queries against a reference promote use of the reference set and the precision and granularity of image annotation, providing a sliding scale from tagging to precise spatial query.
- This project supports the three Rs, benefiting animal welfare: as images are shared, use of existing animals and related data is promoted rather than generation of new images.
- Pharma and SMEs both consume images from the research sector, and generate their own images, e.g. during cellular phenotyping. The resulting toolkit will promote access to research images, and allow sharing of internal images across sites.
Organisations
- University of Edinburgh (Lead Research Organisation)
- University of Manchester (Collaboration)
- University of Exeter (Collaboration)
- University of Edinburgh (Collaboration)
- University of Oxford (Collaboration)
- Farr Institute of Health Informatics Research (Collaboration)
- EMBL European Bioinformatics Institute (EMBL - EBI) (Collaboration)
- Cardiff University (Collaboration)
- ELIXIR (Collaboration)
- Brunel University London (Collaboration)
- University of Liverpool (Collaboration)
- Edinburgh Napier University (Collaboration)
- Medical Research Council (MRC) (Collaboration)
- University of Dundee (Collaboration)
- Earlham Institute (Collaboration)
People
- Richard Baldock (Principal Investigator)
Publications
- Hill B (2015) Constrained distance transforms for spatial atlas registration. BMC Bioinformatics.
- Adebayo S (2016) PhenoImageShare: an image annotation and query infrastructure. Journal of Biomedical Semantics.
- Armit C (2017) The 'straight mouse': defining anatomical axes in 3D embryo models. Database: The Journal of Biological Databases and Curation.
Description | In the current stage of the project we have implemented the core databases, SOLR indexes and the associated services to deliver a functioning web interface (www.phenoimageshare.org) that allows data submission and data annotation. About 360,000 images are already included in the DB with primary annotations, and we are implementing the links through to OMERO and spatial reasoning using atlases. With respect to the latter, we developed the concept of the straight mouse, which facilitates searching images based on spatial descriptions of anatomical regions in an image using biological directions and axes. A manuscript on this aspect has been accepted for publication and is in press. The project work has been presented at a number of workshops and conferences, and a journal paper has been published. |
Exploitation Route | The underlying standards for the database schema and the associated web-service end-points are likely to be taken through to other projects, specifically the IMPC programme. |
Sectors | Digital/Communication/Information Technologies (including Software); Education; Healthcare |
URL | http://www.phenoimageshare.org |
Title | PhenoImageShare |
Description | A database for sharing, integrating and annotating image-based phenotype data. |
Type Of Material | Database/Collection of data |
Year Produced | 2014 |
Provided To Others? | Yes |
Impact | Improved annotation of public data. Improved querying of public data. Definition of specification for ontology tool components. |
URL | http://www.phenoimageshare.org |
Description | BioVis |
Organisation | Brunel University London |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | New Biological and Medical Visualisation community and resource centre |
Collaborator Contribution | Collaboration to develop the visualisation community and build partnerships |
Impact | New website set up and first community meeting organised |
Start Year | 2013 |
Description | BioVis |
Organisation | Earlham Institute |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | New Biological and Medical Visualisation community and resource centre |
Collaborator Contribution | Collaboration to develop the visualisation community and build partnerships |
Impact | New website set up and first community meeting organised |
Start Year | 2013 |
Description | BioVis |
Organisation | Edinburgh Napier University |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | New Biological and Medical Visualisation community and resource centre |
Collaborator Contribution | Collaboration to develop the visualisation community and build partnerships |
Impact | New website set up and first community meeting organised |
Start Year | 2013 |
Description | BioVis |
Organisation | University of Dundee |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | New Biological and Medical Visualisation community and resource centre |
Collaborator Contribution | Collaboration to develop the visualisation community and build partnerships |
Impact | New website set up and first community meeting organised |
Start Year | 2013 |
Description | BioVis |
Organisation | University of Edinburgh |
Department | The Roslin Institute |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | New Biological and Medical Visualisation community and resource centre |
Collaborator Contribution | Collaboration to develop the visualisation community and build partnerships |
Impact | New website set up and first community meeting organised |
Start Year | 2013 |
Description | BioVis |
Organisation | University of Exeter |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | New Biological and Medical Visualisation community and resource centre |
Collaborator Contribution | Collaboration to develop the visualisation community and build partnerships |
Impact | New website set up and first community meeting organised |
Start Year | 2013 |
Description | Centre for Comparative Pathology |
Organisation | University of Edinburgh |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The Centre for Comparative Pathology is a new centre to develop specific skills and research tools for comparative analysis of normal and disease pathologies. It is a cross-Institute centre headed by Professor Mark Arends and Professor Mike Cheeseman and has allowed the funding of new slide-scanning microscopy at the HGU/IGMM, funding of a pathology service and website development, and now has a Wellcome grant under review. I have contributed to all aspects of the development and extension of the CCP. |
Collaborator Contribution | The CCP is directed by Profs Arends and Cheeseman, and their efforts have delivered the benefits of the new tools and the building of the web presence. |
Impact | New equipment at the IGMM and now a proposal for extended funding of tools development submitted to the Wellcome Trust. |
Start Year | 2014 |
Description | DMDD |
Organisation | EMBL European Bioinformatics Institute (EMBL - EBI) |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Deciphering Mechanisms of Developmental Disorders. Two consortia: one funded by the Wellcome Trust with a supplement for equipment at the HGU, the other still under review by the MRC with funding for a post. |
Collaborator Contribution | Large consortium to phenotype embryonic lethal knock-out strains of mice as part of the IMPC |
Impact | Funding acquired from the Wellcome Trust. Further funding sought from the MRC. |
Start Year | 2012 |
Description | DMDD |
Organisation | Medical Research Council (MRC) |
Department | MRC National Institute for Medical Research (NIMR) |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Deciphering Mechanisms of Developmental Disorders. Two consortia: one funded by the Wellcome Trust with a supplement for equipment at the HGU, the other still under review by the MRC with funding for a post. |
Collaborator Contribution | Large consortium to phenotype embryonic lethal knock-out strains of mice as part of the IMPC |
Impact | Funding acquired from the Wellcome Trust. Further funding sought from the MRC. |
Start Year | 2012 |
Description | DMDD |
Organisation | Medical Research Council (MRC) |
Department | The Mary Lyon Centre |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Deciphering Mechanisms of Developmental Disorders. Two consortia: one funded by the Wellcome Trust with a supplement for equipment at the HGU, the other still under review by the MRC with funding for a post. |
Collaborator Contribution | Large consortium to phenotype embryonic lethal knock-out strains of mice as part of the IMPC |
Impact | Funding acquired from the Wellcome Trust. Further funding sought from the MRC. |
Start Year | 2012 |
Description | DMDD |
Organisation | University of Oxford |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Deciphering Mechanisms of Developmental Disorders. Two consortia: one funded by the Wellcome Trust with a supplement for equipment at the HGU, the other still under review by the MRC with funding for a post. |
Collaborator Contribution | Large consortium to phenotype embryonic lethal knock-out strains of mice as part of the IMPC |
Impact | Funding acquired from the Wellcome Trust. Further funding sought from the MRC. |
Start Year | 2012 |
Description | ELIXIR UK node |
Organisation | Cardiff University |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Contribute to the development of training in the context of atlas-based resources and imaging |
Collaborator Contribution | Contribute to all other aspects of training materials for bioinformatics |
Impact | Development of the UK ELIXIR Node |
Start Year | 2013 |
Description | ELIXIR UK node |
Organisation | EMBL European Bioinformatics Institute (EMBL - EBI) |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Contribute to the development of training in the context of atlas-based resources and imaging |
Collaborator Contribution | Contribute to all other aspects of training materials for bioinformatics |
Impact | Development of the UK ELIXIR Node |
Start Year | 2013 |
Description | ELIXIR UK node |
Organisation | Earlham Institute |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Contribute to the development of training in the context of atlas-based resources and imaging |
Collaborator Contribution | Contribute to all other aspects of training materials for bioinformatics |
Impact | Development of the UK ELIXIR Node |
Start Year | 2013 |
Description | ELIXIR UK node |
Organisation | University of Liverpool |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Contribute to the development of training in the context of atlas-based resources and imaging |
Collaborator Contribution | Contribute to all other aspects of training materials for bioinformatics |
Impact | Development of the UK ELIXIR Node |
Start Year | 2013 |
Description | ELIXIR UK node |
Organisation | University of Manchester |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Contribute to the development of training in the context of atlas-based resources and imaging |
Collaborator Contribution | Contribute to all other aspects of training materials for bioinformatics |
Impact | Development of the UK ELIXIR Node |
Start Year | 2013 |
Description | ELIXIR UK node |
Organisation | University of Oxford |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Contribute to the development of training in the context of atlas-based resources and imaging |
Collaborator Contribution | Contribute to all other aspects of training materials for bioinformatics |
Impact | Development of the UK ELIXIR Node |
Start Year | 2013 |
Description | ELIXIR-UK |
Organisation | ELIXIR |
Department | ELIXIR UK |
Country | United Kingdom |
Sector | Charity/Non Profit |
PI Contribution | Contributed to the establishment of the Expression Atlases strategic area for ELIXIR-UK. (Also involves Edinburgh University - lead of PhenoImageShare.) |
Collaborator Contribution | ELIXIR-UK leads and coordinates the UK activities in the context of the European ELIXIR programme. |
Impact | Interdisciplinary: biomedical informatics (biomedical research, computer science research) |
Start Year | 2016 |
Description | Functional Tissue Units |
Organisation | Farr Institute of Health Informatics Research |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | Collaboration with Dr Bernard de Bono on a new research project relating to tissue organisation at the cellular level termed "functional tissue units". |
Collaborator Contribution | Collaborative research and analysis of high-resolution tissue samples. |
Impact | Two publications |
Start Year | 2014 |
Title | Mouse Atlas GitHub repository |
Description | All the woolz image-processing software, applications and the IIP3D 3D tile-image servers are now available from the ma-tech GitHub repository. |
Type Of Technology | Software |
Year Produced | 2016 |
Open Source License? | Yes |
Impact | The woolz technology is gradually being adopted to handle very large 3D image volumes by a number of resources such as IMPC and DMDD. |
URL | https://github.com/ma-tech |
Description | BiVi Annual Meeting TGAC |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | Annual meeting of the BBSRC-funded Biological Visualisation network held at The Genome Analysis Centre (TGAC). Gave an invited talk. |
Year(s) Of Engagement Activity | 2015 |
Description | Conference presentation |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | 10-14 July 2014: Phenoday, ISMB 2015, Dublin (part of the ISMB conference) |
Year(s) Of Engagement Activity | 2014 |
Description | Medical Image Analysis Workshop |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | 27 May 2015: Dundee Medical Image Analysis workshop |
Year(s) Of Engagement Activity | 2015 |