The Development Cell Atlas - Data Coordination Centre

Lead Research Organisation: European Bioinformatics Institute
Department Name: OMICs

Abstract

The Human Cell Atlas is developing a blueprint to better understand how cells function in normal life and in disease; a foundational dataset that is part of this initiative is the Developmental Cell Atlas. By understanding how cells function in normal tissues and how they change over time during development, we provide an atlas which researchers can use to better understand how different cell types emerge in vivo and how things go wrong in diseased tissues. The data and knowledge generated can be used to design new drugs, to repurpose drugs and to detect diseases earlier, as we can see when normal processes change before diseases can be diagnosed. In order to maximise the impact of the Developmental Cell Atlas, the data generated should be collected, versioned and made available to the research community, both academic and industrial. This is a big data problem: it needs tools that can collect and share genetic data and images, databases that can store these data and preserve them for years, and tools to ensure that the results from the data can be used in different ways, such as drug development and basic academic research. This proposal provides the necessary infrastructure to collect, share and preserve the data, respecting the ethical constraints on the data and enabling the wide community of researchers to access the data. The standards, tools and services generated during this project will also be provided to the research community; they will be portable and use cloud technologies, meaning researchers can use the latest techniques for accessing data anywhere in the world.

Technical Summary

The MRC Developmental Cell Atlas Data Coordination Centre (DCC) will take a systematic data-driven approach to facilitate the collection and sharing of data generated by scientists as part of the MRC Developmental Cell Atlas (DCA) call. Deposition, validation and quality assurance of new data types in scRNA-Seq and spatial transcriptomics is a complex process that requires researcher engagement and coordination. To facilitate this process the DCC will implement a community engagement strategy, including dedicated bioinformatics support. We will provide customised submission spreadsheets and JSON format APIs for manual and programmatic submission, ensuring that contributors are supported at every level. The DCC will work with the Developmental Atlas Community to evolve the standards and metadata schemas created by the Human Cell Atlas-Data Coordination Platform (HCA-DCP) to capture information that includes the many facets of a cell's identity, such as tissue source and developmental stage, as well as experimental details. Capturing data in a well-structured form that utilises ontologies maximises the downstream usefulness of the data, provides the ability to query and compute over the Human Cell Atlas, and supports the Findable, Accessible, Interoperable and Reusable (FAIR) principles. Regular releases of the DCA data will be made, and computational researchers will be supported by direct access to both primary and processed data (from grantees) through well-documented consumer APIs. Efficient access will ensure scientific analysis methods can keep pace with the transformative advances and rapid innovation in imaging and sequencing based methods. Establishing the DCA-DCC will maximise the utility of the Developmental Atlas Data and is critical to achieving the long-term vision of creating a reference map of all human cells at different developmental stages that is both accessible and useful to the scientific and medical community.
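
As an illustration of why structured, ontology-backed metadata is easier to validate and compute over, the following is a minimal sketch of a pre-submission check on a JSON-style sample record. The field names (sample_id, tissue, developmental_stage), the text/ontology object layout and the identifier values are assumptions made for this example only; they are not taken from the HCA metadata schemas, which define their own fields and validation rules.

```python
import re

# Hypothetical required fields for a sample record (illustrative only;
# the real metadata schemas define their own field names).
REQUIRED_FIELDS = ("sample_id", "tissue", "developmental_stage")

# Fields assumed to carry an ontology term alongside free text.
ONTOLOGY_FIELDS = ("tissue", "developmental_stage")

# Ontology terms are assumed to be written as CURIEs: PREFIX:digits.
CURIE_PATTERN = re.compile(r"^[A-Za-z][A-Za-z0-9_]*:\d+$")


def validate_record(record):
    """Return a list of human-readable problems found in one record."""
    errors = []

    # Every required field must be present and non-empty.
    for field in REQUIRED_FIELDS:
        if record.get(field) in (None, ""):
            errors.append(f"missing required field: {field}")

    # Ontology-backed fields must be objects with a CURIE-shaped term.
    for field in ONTOLOGY_FIELDS:
        value = record.get(field)
        if value in (None, ""):
            continue  # already reported as missing above
        if not isinstance(value, dict):
            errors.append(f"{field}: expected an object with 'text' and 'ontology'")
            continue
        term = value.get("ontology", "")
        if not isinstance(term, str) or not CURIE_PATTERN.match(term):
            errors.append(f"{field}: ontology term {term!r} is not a CURIE")

    return errors


# Example record; the identifiers are illustrative placeholders.
example = {
    "sample_id": "sample-001",
    "tissue": {"text": "eye", "ontology": "UBERON:0000970"},
    "developmental_stage": {"text": "embryonic stage", "ontology": "HsapDv:0000000"},
}

problems = validate_record(example)
print("valid" if not problems else problems)
```

A check of this kind only confirms that a term is well formed; confirming that the term exists and is appropriate for the field would require a lookup against the ontology itself.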

Planned Impact

Recent transformative advances in experimental and computational methods are enabling high-throughput, quantitative and spatially resolved profiling of individual cells. The Developmental Cell Atlas will provide a foundational dataset that will enable us to understand how cells function in normal tissues and how they change over time during development in normal life and in disease. This will facilitate deeper insight into human health and into diagnosing, monitoring and treating disease. The Developmental Cell Atlas Data Coordination Centre (DCA DCC) will provide the necessary infrastructure to collect, share and preserve these data, respecting the ethical constraints.
Existing challenges in data science and reproducibility of biomedical data in the academic research context call for uniform metadata standards to ensure reproducibility and correct interpretation of resulting data. The DCC will deliver the infrastructure, standards and processes to address these challenges for the developmental atlas and, critically, will provide the wide community of researchers with a dataset that is interoperable with a much larger dataset than the Developmental Atlas alone. Freely available code and unique, interoperable and accessible datasets with robust, semantically comparable data, which make data cleaning significantly easier and enable machine learning approaches, will benefit data scientists and analysts worldwide and provide a benchmark dataset that might be used for comparing different analytical approaches.
This dataset will enhance the research capacity, data, knowledge and skills beyond academic research, as data and tools with clear and permissive licenses are accessible and intended to be used by third parties from industry (big pharma, SMEs) in the biotech, AI, and data management sectors through existing databases such as ENA and via the Human Cell Atlas - Data Coordination Platform. The data and knowledge generated can be used by researchers in industry to design new drugs, to repurpose drugs and in early disease detection.
The DCA DCC will bring together support for novel data generation technologies (for example, imaging), bioinformatics and big data use of cloud technology, and will thereby maximise the impact of data among diverse academic communities and promote multidisciplinary research between these communities. Through collaboration with the parallel Developmental Atlas training proposal, all the resources will be shared more widely through the international EMBL-EBI 'Train Online' platform (https://www.ebi.ac.uk/training/online/), and a webinar on the work of the DCA DCC will be shared via the Human Cell Atlas communications channels, via ELIXIR and via the same 'Train Online' platform.

DCC Beneficiaries:
- Grantees for the call will benefit as they will generate FAIR data stored in sustainable resources.
- Biological researchers will benefit as they will be able to access and query scRNA-Seq data and linked spatial transcriptomics data.
- Bioinformatics resource providers will benefit as they will be able to access the data and integrate this with their own resources. They will also be able to access code under a permissive licence.
- Publishers will be able to link consistently to the datasets supporting publications.
- Translational researchers in industry and academia will benefit from well annotated data with standard metadata, promoting sharing and re-use. This maximises value for research spend and researcher effort as well as promoting data sharing.
- Methods researchers will benefit as there will be stable data releases with which to work and the data will conform to metadata standards.
- Scientists in training will benefit from access to stable, versioned data resources critical for training activities which can then be linked to the data releases.

Publications

 
Description (HCA Organoid) - HCA|Organoid: Pilot action to establish a multi-tissue human organoid platform within the Human Cell Atlas as a booster of future disease-centric, mechanistic, and translational research
Amount € 4,998,687 (EUR)
Funding ID 874769 
Organisation European Commission 
Sector Public
Country European Union (EU)
Start 01/2020 
End 03/2022
 
Description ORGND_Sequencing data integration
Amount $682,956 (USD)
Funding ID INV-035665 
Organisation Bill and Melinda Gates Foundation 
Sector Charity/Non Profit
Country United States
Start 12/2021 
End 11/2023
 
Title Human Cell Atlas Data Coordination Platform Data Release 
Description Data releases are issued on a monthly basis representing the full cohort of Human Cell Atlas data available from the data coordination platform. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
Impact The HCA DCP data release contains data from the MRC Development Cell Atlas; as of March 2022, the HCA DCP data release contains 16 projects and a total of 5.0M cells from embryo or development-associated tissues (e.g. placenta). This data is made openly accessible to researchers all over the globe and can be exploited for downstream research, including cell type characterisation. 
URL https://data.humancellatlas.org/
 
Title The Human Cell Atlas Metadata Standards 
Description A set of JSON schemas which define metadata standards for development and adult human biological samples and cellular resolution assays such as single-cell RNA-seq, single-cell ATAC-seq and spatial transcriptomics. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
Impact These standards are building on years of experience gained by EMBL-EBI archiving information about biological samples and transcriptomic data through databases such as BioSamples and ArrayExpress. The standards defined here will ensure the data collected by the MRC Development Cell Atlas are computationally readable and support the Findability, Accessibility, Interoperability and Reusability (FAIR) of the generated data. The FAIRness of the MRC Development Cell Atlas data will mean that maximal value can be derived from it both for the projects it was collected for and in future efforts which will be able to reuse the data. 
URL https://github.com/HumanCellAtlas/metadata-schema
 
Description Biological sample record flow from HDBR to BioSamples 
Organisation Newcastle University
Department Institute of Genetic Medicine
Country United Kingdom 
Sector Academic/University 
PI Contribution The EMBL-EBI team is working with HDBR representatives to deposit HDBR sample records in the BioSamples database. The goal is to establish a standard flow of sample records from HDBR to BioSamples. The EMBL-EBI team is harmonizing the sample metadata from HDBR with the HCA sample metadata standards. This work will ensure these records can be seamlessly integrated into HCA data contributions.
Collaborator Contribution The HDBR representatives are providing example sample metadata from their own records and working with the EMBL-EBI team to ensure our harmonization is accurate and representative of their own sample data.
Impact This collaboration will result in HDBR sample records deposited in BioSamples
Start Year 2019
 
Description Biological sample record flow from HDBR to BioSamples 
Organisation University College London
Department MRC/Wellcome Trust Human Developmental Biology Resource
Country United Kingdom 
Sector Private 
PI Contribution The EMBL-EBI team is working with HDBR representatives to deposit HDBR sample records in the BioSamples database. The goal is to establish a standard flow of sample records from HDBR to BioSamples. The EMBL-EBI team is harmonizing the sample metadata from HDBR with the HCA sample metadata standards. This work will ensure these records can be seamlessly integrated into HCA data contributions.
Collaborator Contribution The HDBR representatives are providing example sample metadata from their own records and working with the EMBL-EBI team to ensure our harmonization is accurate and representative of their own sample data.
Impact This collaboration will result in HDBR sample records deposited in BioSamples
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation Newcastle University
Department Institute of Genetic Medicine
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation The Wellcome Trust Sanger Institute
Department Human Genetics
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Cambridge
Department Cambridge Stem Cell Institute
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Cambridge
Department Department of Haematology
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Cambridge
Department Department of Medicine
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Cambridge
Department Department of Physiology, Development and Neuroscience
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Edinburgh
Department Institute of Genetics & Molecular Medicine
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Edinburgh
Department MRC Centre for Regenerative Medicine
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Manchester
Department Division of Diabetes, Endocrinology & Gastroenterology
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Manchester
Department School of Dentistry Manchester
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Oxford
Department Department of Paediatrics
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Oxford
Department Experimental Medicine Division
Country United Kingdom 
Sector Hospitals 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Development Cell Atlas - Data Coordination Centre 
Organisation University of Oxford
Department Kennedy Institute of Rheumatology
Country United Kingdom 
Sector Academic/University 
PI Contribution The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020.
Collaborator Contribution Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources.
Impact No data has been deposited yet but we are expecting this data to arrive during 2020.
Start Year 2019
 
Description The Human Cell Atlas Data Coordination Platform 
Organisation Broad Institute
Country United States 
Sector Charity/Non Profit 
PI Contribution The EMBL-EBI team lead the establishment of metadata standards and the provision of data ingestion services to the Human Cell Atlas Data Coordination Platform (HCA-DCP). Our role as the MRC Development Cell Atlas Data Coordination centre utilizes the existing services built as part of this collaboration and is extending and improving these services. Our work enables the Data Coordination Platform to share data generated for the MRC Development Cell Atlas. We have evolved the existing metadata standards, providing for specific information required for the Development Cell Atlas. This work ensures these data are maximally useful to build the Development Cell Atlas and for it to be integrated as part of the comprehensive Human Cell Atlas.
Collaborator Contribution The Data Coordination Platform (DCP) is built from components; each institution is responsible for different pieces. The Broad Institute is responsible for developing and operating our standard analysis pipelines, which align the transcriptomic data to the genome and produce gene count matrices for each sample. The University of California, Santa Cruz is responsible for operating the data storage for the DCP. They also design and manage the data portal (https://data.humancellatlas.org/), which provides the community tools to explore and download the HCA data. Stanford University is responsible for shepherding data releases to give the HCA community harmonised and consistently analysed data collections to use in their atlas building efforts. The Chan Zuckerberg Initiative is building services to provide access to the gene count matrix files from the DCP and to support querying the DCP to find specific subsets of the data. These components, together with the metadata standards and data ingestion services built by EMBL-EBI, give the HCA community a data platform. This platform facilitates the community sharing, exploring and analysing all contributed data to move toward the ultimate goal of the Human Cell Atlas.
Impact The major outputs from this collaboration are the software components being used to run the Data Coordination Platform. The software is all held under the Human Cell Atlas GitHub organisation https://github.com/HumanCellAtlas.
Start Year 2017
 
Description The Human Cell Atlas Data Coordination Platform 
Organisation Chan Zuckerberg Initiative
Country United States 
Sector Private 
PI Contribution The EMBL-EBI team lead the establishment of metadata standards and the provision of data ingestion services to the Human Cell Atlas Data Coordination Platform (HCA-DCP). Our role as the MRC Development Cell Atlas Data Coordination centre utilizes the existing services built as part of this collaboration and is extending and improving these services. Our work enables the Data Coordination Platform to share data generated for the MRC Development Cell Atlas. We have evolved the existing metadata standards, providing for specific information required for the Development Cell Atlas. This work ensures these data are maximally useful to build the Development Cell Atlas and for it to be integrated as part of the comprehensive Human Cell Atlas.
Collaborator Contribution The Data Coordination Platform (DCP) is built from components; each institution is responsible for different pieces. The Broad Institute is responsible for developing and operating our standard analysis pipelines, which align the transcriptomic data to the genome and produce gene count matrices for each sample. The University of California, Santa Cruz is responsible for operating the data storage for the DCP. They also design and manage the data portal (https://data.humancellatlas.org/), which provides the community tools to explore and download the HCA data. Stanford University is responsible for shepherding data releases to give the HCA community harmonised and consistently analysed data collections to use in their atlas building efforts. The Chan Zuckerberg Initiative is building services to provide access to the gene count matrix files from the DCP and to support querying the DCP to find specific subsets of the data. These components, together with the metadata standards and data ingestion services built by EMBL-EBI, give the HCA community a data platform. This platform facilitates the community sharing, exploring and analysing all contributed data to move toward the ultimate goal of the Human Cell Atlas.
Impact The major outputs from this collaboration are the software components being used to run the Data Coordination Platform. The software is all held under the Human Cell Atlas GitHub organisation https://github.com/HumanCellAtlas.
Start Year 2017
 
Description The Human Cell Atlas Data Coordination Platform 
Organisation Stanford University
Country United States 
Sector Academic/University 
PI Contribution The EMBL-EBI team lead the establishment of metadata standards and the provision of data ingestion services to the Human Cell Atlas Data Coordination Platform (HCA-DCP). Our role as the MRC Development Cell Atlas Data Coordination centre utilizes the existing services built as part of this collaboration and is extending and improving these services. Our work enables the Data Coordination Platform to share data generated for the MRC Development Cell Atlas. We have evolved the existing metadata standards, providing for specific information required for the Development Cell Atlas. This work ensures these data are maximally useful to build the Development Cell Atlas and for it to be integrated as part of the comprehensive Human Cell Atlas.
Collaborator Contribution The Data Coordination Platform (DCP) is built from components; each institution is responsible for different pieces. The Broad Institute is responsible for developing and operating our standard analysis pipelines, which align the transcriptomic data to the genome and produce gene count matrices for each sample. The University of California, Santa Cruz is responsible for operating the data storage for the DCP. They also design and manage the data portal (https://data.humancellatlas.org/), which provides the community tools to explore and download the HCA data. Stanford University is responsible for shepherding data releases to give the HCA community harmonised and consistently analysed data collections to use in their atlas building efforts. The Chan Zuckerberg Initiative is building services to provide access to the gene count matrix files from the DCP and to support querying the DCP to find specific subsets of the data. These components, together with the metadata standards and data ingestion services built by EMBL-EBI, give the HCA community a data platform. This platform facilitates the community sharing, exploring and analysing all contributed data to move toward the ultimate goal of the Human Cell Atlas.
Impact The major outputs from this collaboration are the software components being used to run the Data Coordination Platform. The software is all held under the Human Cell Atlas GitHub organisation https://github.com/HumanCellAtlas.
Start Year 2017
 
Description The Human Cell Atlas Data Coordination Platform 
Organisation University of California, Santa Cruz
Country United States 
Sector Academic/University 
PI Contribution The EMBL-EBI team lead the establishment of metadata standards and the provision of data ingestion services to the Human Cell Atlas Data Coordination Platform (HCA-DCP). Our role as the MRC Development Cell Atlas Data Coordination centre utilizes the existing services built as part of this collaboration and is extending and improving these services. Our work enables the Data Coordination Platform to share data generated for the MRC Development Cell Atlas. We have evolved the existing metadata standards, providing for specific information required for the Development Cell Atlas. This work ensures these data are maximally useful to build the Development Cell Atlas and for it to be integrated as part of the comprehensive Human Cell Atlas.
Collaborator Contribution The Data Coordination Platform (DCP) is built from components; each institution is responsible for different pieces. The Broad Institute is responsible for developing and operating our standard analysis pipelines, which align the transcriptomic data to the genome and produce gene count matrices for each sample. The University of California, Santa Cruz is responsible for operating the data storage for the DCP. They also design and manage the data portal (https://data.humancellatlas.org/), which provides the community tools to explore and download the HCA data. Stanford University is responsible for shepherding data releases to give the HCA community harmonised and consistently analysed data collections to use in their atlas building efforts. The Chan Zuckerberg Initiative is building services to provide access to the gene count matrix files from the DCP and to support querying the DCP to find specific subsets of the data. These components, together with the metadata standards and data ingestion services built by EMBL-EBI, give the HCA community a data platform. This platform facilitates the community sharing, exploring and analysing all contributed data to move toward the ultimate goal of the Human Cell Atlas.
Impact The major outputs from this collaboration are the software components being used to run the Data Coordination Platform. The software is all held under the Human Cell Atlas GitHub organisation https://github.com/HumanCellAtlas.
Start Year 2017
 
Title The Human Cell Atlas Data Coordination Platform (HCA-DCP) Ingestion Service 
Description This software provides all the functionality to validate and write data to the HCA DCP. 
Type Of Technology Software 
Year Produced 2017 
Open Source License? Yes  
Impact The HCA DCP ingestion service provides vital infrastructure that enables the MRC Development Cell Atlas to share any generated data via both the HCA DCP and the EMBL-EBI archives such as BioSamples and the European Nucleotide Archive. This functionality ensures the community can access and use all the generated data to build the development atlas specifically and all parts of the Human Cell Atlas. This software has been, and will continue to be, updated during the operation of the MRC Development Cell Atlas to ensure the DCP can capture the biological samples and experimental assays conducted by the researchers who are part of the MRC Development Cell Atlas.
URL https://contribute.data.humancellatlas.org/
 
Description Face to Face Meeting with MRC Dev Atlas Awardees - Edinburgh 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact This meeting presented the overall Human Cell Atlas Data Coordination Platform to the MRC Development Cell Atlas lab in Edinburgh and gave them an overview of the data contribution process. This visit allowed the EMBL-EBI team to gain a greater understanding of how to support the Edinburgh lab and ensure the data they are generating can be supported by the Data Coordination Platform.
Year(s) Of Engagement Activity 2019
 
Description Managing single cell transcriptomics data 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Members of the EMBL-EBI team planned and delivered the Data Management of Single Cell Data training course. They gave presentations and practical demonstrations to the course attendees. These were about data management best practice; how to contribute data to the Human Cell Atlas; and how to store and structure data to ensure it is Findable, Accessible, Interoperable and Reusable (FAIR). One part of the course focused on the use of ontologies to provide semantics to the metadata associated with an experiment. Another focused on spatially resolved transcriptomics and experimental design considerations.
Year(s) Of Engagement Activity 2019
URL https://www.ebi.ac.uk/training/events/2019/managing-single-cell-transcriptomics-data
 
Description Single cell RNA-seq analysis: From questions to clusters - Spatial Transcriptomics 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact A member of the EMBL-EBI team introduced the Human Cell Atlas Data Coordination Platform and gave a presentation about the spatial transcriptomics methods being used as part of the Human Cell Atlas project. This increased awareness of spatial transcriptomics methods, a newer experimental approach, among the course attendees and gave them insight into how to store and process data of this type, so they are better prepared when they encounter it in their own lab.
Year(s) Of Engagement Activity 2019
URL https://www.ebi.ac.uk/training/events/2019/single-cell-rna-seq-analysis-questions-clusters
 
Description The Human Cell Atlas Data Coordination Platform 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact The MRC Development Cell Atlas Data Wrangler attended the Norwich Single Cell Symposium and presented a poster about the Human Cell Atlas Data Coordination Platform and our efforts to ensure the data collected by the Development Cell Atlas and other Cell Atlasing initiatives is Findable, Accessible, Interoperable and Reusable. Many attendees of the symposium expressed interest in our efforts and how they could adopt the HCA metadata standards and apply them to their work.
Year(s) Of Engagement Activity 2019
URL https://www.earlham.ac.uk/single-cell-symposium-2019
 
Description The Human Cell Atlas Data Coordination Platform and Metadata Standards 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Members of the EMBL-EBI team planned and delivered the Data Management of Single Cell Data training course. They gave presentations and practical demonstrations to the course attendees. These were about data management best practice; how to contribute data to the Human Cell Atlas; and how to store and structure data to ensure it is Findable, Accessible, Interoperable and Reusable (FAIR).
Year(s) Of Engagement Activity 2020
URL https://www.ebi.ac.uk/training/events/starting-single-cell-rna-seq-analysis-virtual/
 
Description The Human Cell Atlas Data Coordination Platform and Metadata Standards 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact Members of the EMBL-EBI team planned and delivered the Data Management of Single Cell Data training course. They gave presentations and practical demonstrations to the course attendees. These were about data management best practice; how to contribute data to the Human Cell Atlas; and how to store and structure data to ensure it is Findable, Accessible, Interoperable and Reusable (FAIR).
Year(s) Of Engagement Activity 2021
URL https://www.ebi.ac.uk/training/events/single-cell-rna-seq-analysis-using-r-virtual/