PhenoImageShare - a phenotype image annotation, sharing and discovery platform

Lead Research Organisation: EMBL - European Bioinformatics Institute
Department Name: Microarray Group

Abstract

Scientists have now completed the sequencing of the genome of many organisms including man, mouse, chicken and other vertebrates. The next major challenge is to understand the function of the genes and other parts of the genome in the living organism in normal, abnormal and diseased conditions. The observed variation in the living organism is called phenotype and the grand challenge in biological science is to understand how the variation of the genome of an individual influences and changes the observed phenotype. This will provide critical information to allow scientists to decode the complex processes that can lead to abnormal development and disease.

There are many measures of phenotype, for example height or weight. More recently scientists have been using 2D and 3D imaging techniques to observe different phenotypes, including shape change and abnormal development (e.g. hole in the heart), and also phenotypes at the level of cellular arrangement and gene expression. A challenge is to bring together all the pieces of information about phenotype across databases of many hundreds of thousands of images so that scientists can look for patterns that can reveal common and related causes. This project will develop software and databases that will allow the disparate image databases to be integrated in terms of the phenotypes that have been observed for each image.

We will develop a central database and web portal that will allow scientists to search all the contributing image archives to find images that relate to particular conditions and diseases. The tools will include web-based interfaces to generate a standardised version of the phenotype plus spatial or location annotation (rather like the flags on Google Maps) that allows queries to relate to parts of the body. The software will allow individual laboratories or large consortia to publish their phenotype annotations in a way that is integrated with others. This federation of databases allows cross-querying of large image archives without having to bring data together in one place, which would be very difficult to fund and maintain given the data volumes that are now collected.

The PhenoImageShare resource will provide the means to integrate the phenotype images held in many resources and allow scientists to search and mine for associations across all the studies, small and large, that might be relevant. This will allow faster access to relevant data and minimise the use of animals by reducing re-experimentation that can arise simply because data could not be found.

Technical Summary

Bio-imaging is key to observing and quantifying morphological and histological phenotype. There are major sets of image data capturing 3D and high-resolution histology images for adult and embryo phenotype. In general the phenotype resources may be searched individually, but there is no mechanism for integration, cross-query and analysis, especially with respect to human abnormality and disease phenotype. Furthermore, the annotations will typically be traits found by manual scanning; secondary analysis for subtle variation, especially at the cellular level, remains rather difficult. Finally, none of the data will be in the context of a spatio-temporal framework for spatial analysis and interoperability with atlas-based resources such as the Allen Brain Atlas and eMouseAtlas.

PhenoImageShare will provide a toolkit for complex image annotation, sharing, discovery and query from federated biological images supporting phenotype description. Feasibility will be demonstrated using high throughput phenotype images from KOMP2 and IMPC, histology images of adult tissues early post-natal (EUCOMMTOOLS) and embryo (EMAGE). Images will be a combination of 3D (OPT, uCT, uMR) and high-resolution histology sections. All tools will be applicable to other model systems. We will develop:

1. federated phenotype query capabilities across image archives;
2. standards for phenotype and spatial annotation with server technology for interoperability;
3. interfaces to annotate phenotype with standard ontologies and within image and atlas spatial markup;
4. a referencing protocol ("data-track") for spatial queries and visualisation with respect to atlas frameworks;
5. a plug-in to an open image-archiving system (OME) enabling lab-publication of annotations to be integrated via the central DB.

The use-cases include demonstrating federated query of multiple image sources using the ontology-based annotation as well as direct spatial queries from a standard atlas framework such as eMouseAtlas.
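
The two use-cases above (ontology-based federated query and direct spatial query) can be illustrated with a minimal sketch. It is not the PhenoImageShare implementation: the record layout, the `EX:` term identifiers, the archive names and the example URLs are all hypothetical, each archive is mocked as an in-memory list, and a plain rectangle in normalised image coordinates stands in for real atlas-based spatial markup. The point it shows is that only annotations and image links are indexed centrally, while the images themselves stay in their home archives.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

Roi = Tuple[float, float, float, float]  # x_min, y_min, x_max, y_max in 0..1


@dataclass(frozen=True)
class Annotation:
    """Hypothetical annotation record: a link to a remote image plus its markup."""
    image_url: str              # the image itself stays in its home archive
    term_id: str                # ontology term (placeholder identifiers below)
    roi: Optional[Roi] = None   # optional region of interest on the image


# Hypothetical mini-ontology, stored as child term -> parent term.
PARENT: Dict[str, str] = {
    "EX:heart_septum_defect": "EX:heart_defect",
    "EX:heart_defect": "EX:abnormal_morphology",
    "EX:short_tail": "EX:abnormal_morphology",
}


def is_a(term: Optional[str], ancestor: str) -> bool:
    """True if `term` equals `ancestor` or is one of its descendants."""
    while term is not None:
        if term == ancestor:
            return True
        term = PARENT.get(term)
    return False


def roi_contains(roi: Roi, point: Tuple[float, float]) -> bool:
    """True if `point` lies inside the rectangular region of interest."""
    x_min, y_min, x_max, y_max = roi
    x, y = point
    return x_min <= x <= x_max and y_min <= y <= y_max


def federated_query(
    archives: Dict[str, List[Annotation]],
    term: str,
    point: Optional[Tuple[float, float]] = None,
) -> List[Tuple[str, str]]:
    """Return (archive name, image URL) pairs matching the term and, if given, the point."""
    hits = []
    for name, annotations in archives.items():
        for ann in annotations:
            if not is_a(ann.term_id, term):
                continue
            if point is not None and (ann.roi is None or not roi_contains(ann.roi, point)):
                continue
            hits.append((name, ann.image_url))
    return hits


# Two mock archives; in a real federation these would be remote services.
ARCHIVES: Dict[str, List[Annotation]] = {
    "archive_a": [
        Annotation("https://archive-a.example/img/1", "EX:heart_septum_defect", (0.4, 0.3, 0.6, 0.5)),
        Annotation("https://archive-a.example/img/2", "EX:short_tail"),
    ],
    "archive_b": [
        Annotation("https://archive-b.example/img/9", "EX:heart_defect", (0.1, 0.1, 0.2, 0.2)),
    ],
}

if __name__ == "__main__":
    # Semantic query: the term itself and all of its descendants, across both archives.
    print(federated_query(ARCHIVES, "EX:heart_defect"))
    # Semantic plus spatial query: only annotations whose region covers the point.
    print(federated_query(ARCHIVES, "EX:heart_defect", point=(0.5, 0.4)))
```

In this sketch the semantic query matches a term together with its descendants, so an image annotated with a specific defect is also found by a query for the broader class. Supplying a point narrows the result to annotations whose region covers that location, which is the simplest form of the "flag on a map" style of spatial query.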

Planned Impact

As reference genomes and large-scale programmes to generate mutants and knock-outs are completed, there has been a matching effort to establish and codify phenotype with genomic coverage. In parallel, bio-imaging is emerging as a primary mechanism to observe and quantify morphological and histological phenotype in mouse embryos and adults. There are also other major collections of image data, for example from the Sanger Institute's zebrafish phenotype screens and plant screens.

Current phenotyping effort will deliver annotations held in independent databases associated with the primary data, which may be searched individually, but there is no mechanism for integration, cross-query and analysis, especially with respect to human abnormality and disease phenotypes. Furthermore, the image annotations will record obvious traits found by manual scanning but will not include or allow deeper investigation of more subtle variation, especially at the cellular level. Finally, current data will not be published in the context of a common spatio-temporal framework allowing more complex analysis and interoperability with other atlas-based resources such as the Allen Brain Atlas and eMouseAtlas gene expression databases.

PhenoImageShare addresses the need to locate, share annotation, map to spatial references, provide semantic and spatial queries between images and map these to image and genomic/transcriptomic objects for co-query. Mouse images will be used within the project as these are an excellent set for prototyping and freely available, but the technology and toolkit are applicable to any phenotype images from any species. The ability to index and share annotation on existing images will be delivered early, with later stages devoted to spatial queries for complex images. Technology will be delivered to the community initially as low-granularity image tagging, while working towards integration with reference atlases.

PhenoImageShare beneficiaries:
- Biological researchers: will be able to access and query phenotypes (and the underlying images) in the context of the genome/transcriptome, and to make spatial queries between large image sets.
- Bioinformatics resource providers: benefit as they will be able to access the annotations on images from the annotation server, register their own images, access annotation and viewing tools for images without importing all the images.
- Image collection owners: will be able to provide their image annotation and location using a standard protocol realising low cost provision of access to public images. Image queries will be provided at a range of granularities, allowing collection owners to iterate and provide increasingly rich data as their images are annotated as projects progress.
- Publishers: will be able to provide image annotations related to papers.
- Translational researchers in industry and academia: the images in question support cellular phenotyping and complex phenotypes, and many of the image subjects are models for disease research.
- Funders: images archived, well annotated, shared and re-used, maximising value for research spend and researcher effort as well as promoting data sharing.
- Image analysis experts: a searchable corpus of images on which to develop methodology.
- Scientists in training: seeing images in the context of a reference atlas is an excellent way to learn.
- Owners or funders of reference atlases: spatial queries vs. a reference promotes use of the reference set and the precision and granularity of image annotation, providing a sliding scale of tagging to precise spatial query.
- This project supports the three Rs, benefiting animal welfare: as images are shared, use of existing animals and related data is promoted over the generation of new images.
- Pharma and SMEs both consume images from the research sector, and generate their own images, e.g. during cellular phenotyping. The resulting toolkit will promote access to research images, and allow sharing of internal images across sites.

Publications

Adebayo S (2016) PhenoImageShare: an image annotation and query infrastructure. in Journal of biomedical semantics

 
Description In year 3 of this proposal we have generated an updated production service for accessing image annotations for federated (remote) images, thereby improving access to image data for the biological research community. The service is used by an interface, which has been improved through discussion with and feedback from users. We have used the service in another project (IMPC) to link to third party image data simply and quickly.
Exploitation Route We have used PhenoImageShare services in a large collaborative project (mousephenotype.org) for third party integration. These components will be retained and extended as needed.
Sectors Agriculture, Food and Drink; Digital/Communication/Information Technologies (including Software); Healthcare; Pharmaceuticals and Medical Biotechnology

URL http://www.phenoimageshare.org/
 
Description Our findings have been used in the project's infrastructure and also in the IMPC's data integration infrastructure to bring in third party data. The schema for the image meta data has been shared with colleagues at EBI involved in developing a large image database and we have used these in a TRDF proposal where we will integrate semantic tools with the PhenoImageShare services to enable automation of annotation of images. The use of the services as well as the platform by the community represents an advance for the project and we will continue to use these in future data integration projects.
First Year Of Impact 2014
Sector Digital/Communication/Information Technologies (including Software)
Impact Types Societal, Economic

 
Title BioImageArchive 
Description A generic database of reference biomedical image data hosted at EMBL-EBI, UK. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
Impact EMBL's European Bioinformatics Institute (EMBL-EBI) has a long track record in archiving electron and X-ray microscopy image data in EMPIAR. More recently, EMBL-EBI in collaboration with OME has built the Image Data Resource (IDR), an added-value database for light-microscopy bioimaging data. IDR is now splitting into Cell-IDR and Tissue-IDR to improve scalability and data accessibility. Users have also been able to submit images from other modalities, including light microscopy, to the EMBL-EBI BioStudies database, which holds data that do not fit in any of the structured-data archives at EMBL-EBI. BioStudies and IDR are already connected so that cell and tissue reference-image data in BioStudies can be easily imported into IDR for integration on the basis of genetic or drug perturbations and phenotype. Likewise, data submitted to IDR (soon, Cell-IDR and Tissue-IDR) can be automatically archived in BioStudies.
URL https://www.ebi.ac.uk/bioimage-archive/
 
Title PhenoImageShare
Description A database for sharing, integrating and annotating image based phenotype data. 
Type Of Material Database/Collection of data 
Year Produced 2014 
Provided To Others? Yes  
Impact Improved annotation of public data. Improved querying of public data. Definition of specification for ontology tool components. 
URL http://www.phenoimageshare.org
 
Description BioImageArchive Development 
Organisation European Molecular Biology Laboratory
Department European Molecular Biology Laboratory Heidelberg
Country Germany 
Sector Academic/University 
PI Contribution Outcomes from the PhenoImageShare project, including the schema, annotation and implementation, have contributed to the design of a species- and domain-neutral image archive under development (but not yet public) at EMBL-EBI.
Collaborator Contribution Design of a species neutral image resource.
Impact Draft meta data standard.
Start Year 2018
 
Description EOSCLife 
Organisation ELIXIR
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution The ELIXIR-led EOSCLife proposal has been informed by PhenoImageShare's approach to meta data standards, ontology management and indexing technology. The focus of the EOSCLife proposal is to provide cloud-accessible datasets, and the approach to federated query, aggregate indexing and user requirements defined in PhenoImageShare has been maintained in this European proposal.
Collaborator Contribution Meta data standards, ontology standards and content, architecture design, data sharing.
Impact Strategy for meta data representation for image data; ontology services and annotation/mark-up of image data; virtualisation strategy extended to cloud deployment.
Start Year 2019
 
Description PhenoImageShare, the Wellcome Trust Sanger Institute and ICS in France have collaborated to integrate brain histopathology data into IMPC using an IMPC deployment of PhenoImageShare technology
Organisation Mouse Clinical Institute
Country France 
Sector Academic/University 
PI Contribution Explained the technology, deployed for new brain histopathology datasets at two partners and integrated the data using PhenoImageShare services into an existing IMPC query portal.
Collaborator Contribution Supplied data and annotation to perform data dissemination.
Impact Use and extension of the PhenoImageShare services
Start Year 2015
 
Description PhenoImageShare, the Wellcome Trust Sanger Institute and ICS in France have collaborated to integrate brain histopathology data into IMPC using an IMPC deployment of PhenoImageShare technology
Organisation The Wellcome Trust Sanger Institute
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution Explained the technology, deployed for new brain histopathology datasets at two partners and integrated the data using PhenoImageShare services into an existing IMPC query portal.
Collaborator Contribution Supplied data and annotation to perform data dissemination.
Impact Use and extension of the PhenoImageShare services
Start Year 2015
 
Title PhenoImageShare 
Description Query service and schema for representing and querying phenotype-image data
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact PhenoImageShare (PhIS) is an online cross-species, cross-repository tool enabling semantic discovery, browsing and complex annotation of phenotype images. PhIS provides web tools that support the discovery, search and sharing of these annotations and associated metadata and genotype information. The tools also allow for submission or registration of new image instances and their annotations. PhenoImageShare also provides services which can be deployed in other platforms, and these provide sustainability for the project.
URL http://www.phenoimageshare.org
 
Description Connected Data London 2016 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The conference was focused on Neo4j use cases around the world. PhIS also uses Neo4j as a data store. The aim was to find out how other people use it, common problems and solutions, scalability and performance.
Year(s) Of Engagement Activity 2016
URL http://connected-data.london/
 
Description Dissemination and demonstration of PhenoImageShare GUI at ISMB PhenoDay Special Interest Group
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A presentation and live demonstration of the PhenoImageShare platform resulted in engagement with the audience and follow up from three representatives of the user community who offered requirements for new features for the project. A conference paper was accepted, and a full journal paper has been submitted.
Year(s) Of Engagement Activity 2015
URL http://phenoday2015.bio-lark.org/PhenotypeDay2015.pdf
 
Description Dissemination of PhenoImageShare to EUCOMMTools Project 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A presentation on PhenoImageShare to the EUCOMMTools project of the EC. Project infrastructure and capability presented and feedback gained on software implementation.
Year(s) Of Engagement Activity 2015
 
Description EBI Service Day Activities 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Professional Practitioners
Results and Impact EBI wide discussion of services and tools at which PhenoImageShare was presented.
Year(s) Of Engagement Activity 2015
 
Description Face-to-Face meeting with project collaborators - Edinburgh 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Planning for the last QT, meeting the new developer on the project, evaluating the interface and prioritising tasks.
Year(s) Of Engagement Activity 2016
 
Description Segmentation Workshop organised by PDBe 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Presentation of the query service and GUI to light microscopy experts.
Year(s) Of Engagement Activity 2015
 
Description Wellcome Trust Scientific Conference Mouse Models of Disease: Using pathology techniques to enhance phenotyping outcomes. 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact A conference at which technologies for data generation in mouse phenotyping were discussed was selected as a venue to engage users and determine requirements for the software.
Year(s) Of Engagement Activity 2014