BioStudies and the Image Data Resource: Expanding Imaging Datasets, Linkage, Metadata, and Value
Lead Research Organisation:
European Bioinformatics Institute
Department Name: OMICs
Abstract
Abstracts are not currently available in GtR for all funded research. This is normally because the abstract was not required at the time of proposal submission, but may be because it included sensitive information such as personal details.
Technical Summary
Much of the published research in the life sciences includes multidimensional, quantitative image data. These images are routinely used for quantitative measures of biological processes and structures, which form the foundation of many of the results published in peer-reviewed life sciences journals. In almost all cases, however, images are presented in published articles in processed, compressed formats that do not accurately convey the quality and complexity of the original image data. The sheer size and heterogeneity of image data sets (multi-dimensional image stacks combined with experimental metadata and analytic results) make handling and publishing image data extremely complex, so in practice it is rarely achieved.
In this project we aim to build a submission pipeline for depositing reference imaging data in BioStudies and then in IDR, which will grow the datasets publicly available in both resources. Alongside the pipeline, we will update the data submission templates and build metadata validators for use by submitters. This will help ensure that metadata are submitted correctly and will reduce the time IDR staff spend curating submitted studies. We will also extend the value of the stored data by adding links to several valuable resources and by extending the metadata that IDR holds.
Planned Impact
There are several forms of impact from this project. The first will derive from the imaging datasets we make available in BioStudies and IDR. These datasets can be accessed through the interactive interfaces presented by the two resources, and thus meet two recent requirements for scientific data: that datasets be findable and accessible. Reference datasets included in IDR will further be integrated with other datasets through curation and normalisation, which starts to make them interoperable. They will also be available via the IDR Jupyter resource and downloadable via Aspera, which makes them reusable.
One of the aims of BioStudies is to catalyse the development of data standards in the life sciences: data can initially be described using the lightweight structures offered by BioStudies, and tighter requirements can then be defined incrementally. The proposed project will serve as a proof of concept of this process.
In addition, this project will help support the emerging movement to make the publication of imaging data routine and possibly, in the future, mandatory for scientific publications. Journals, funders and community scientists are currently debating this issue; we hope to energise this debate and provide technical solutions, scientific examples and rationales for publishing imaging data routinely.
Finally, the datasets are all available for download from BioStudies or IDR, providing resources for the development of new image processing and analysis tools. Moreover, the IDR application stacks and metadata databases are all available, which allows others to download and re-use IDR data and systems, and to integrate their own datasets and analytics.
People
Alvis Brazma (Principal Investigator)
Ugis Sarkans (Co-Investigator)
Publications
Ellenberg J (2018) Article, in Nature Methods
Hartley M (2022) The BioImage Archive - Building a Home for Life-Sciences Microscopy Data, in Journal of Molecular Biology
Sarkans U (2021) REMBI: Recommended Metadata for Biological Images - enabling reuse of microscopy data in biology, in Nature Methods
Description | This is an infrastructure-building grant, and nothing has been discovered in the normal sense of the word "discovery". Obviously, those who designed this interface do not realise that science infrastructure is not about discovery. However, we have a growing number of users of, and submitters to, the databases supported by this grant, which demonstrates the usefulness of what we are doing. |
Exploitation Route | A solid bioinformatics infrastructure for biological image archiving and sharing. |
Sectors | Education; Healthcare |