The Development Cell Atlas - Data Coordination Centre
Lead Research Organisation:
EMBL - European Bioinformatics Institute
Department Name: OMICs
Abstract
The Human Cell Atlas is developing a blueprint to better understand how cells function in normal life and in disease; a foundational dataset within this initiative is the Developmental Cell Atlas. By describing how cells function in normal tissues and how they change over time during development, we provide an atlas that researchers can use to better understand how different cell types emerge in vivo and how things go wrong in diseased tissues. The data and knowledge generated can be used to design new drugs, to repurpose existing drugs and to detect diseases earlier, because changes in normal processes become visible before a disease can be diagnosed. To maximise the impact of the Developmental Cell Atlas, the data generated should be collected, versioned and made available to the research community, both academic and industrial. This is a big data problem. It needs tools that can collect and share genetic data and images, databases that can store these data and preserve them for years, and tools to ensure that results derived from the data can be used in different ways, such as drug development and basic academic research. This proposal provides the necessary infrastructure to collect, share and preserve the data, respecting the ethical constraints on the data and enabling the wide community of researchers to access them. The standards, tools and services generated during this project will also be provided to the research community. They will be portable and will use cloud technologies, so researchers can use the latest techniques to access the data from anywhere in the world.
Technical Summary
The MRC Developmental Cell Atlas Data Coordination Centre (DCC) will take a systematic, data-driven approach to facilitate the collection and sharing of data generated by scientists as part of the MRC Developmental Cell Atlas (DCA) call. Deposition, validation and quality assurance of new data types such as scRNA-Seq and spatial transcriptomics is a complex process that requires researcher engagement and coordination. To facilitate this process the DCC will implement a community engagement strategy, including dedicated bioinformatics support. We will provide customised submission spreadsheets for manual submission and JSON-format APIs for programmatic submission, ensuring that contributors are supported at every level. The DCC will work with the Developmental Atlas Community to evolve the standards and metadata schemas created by the Human Cell Atlas-Data Coordination Platform (HCA-DCP) to capture information that includes the many facets of a cell's identity, such as tissue source and developmental stage, as well as experimental details. Data captured in a well-structured form that uses ontologies maximises the downstream usefulness of the data, makes it possible to query and compute over the Human Cell Atlas, and supports the Findable, Accessible, Interoperable and Reusable (FAIR) principles. Regular releases of the DCA data will be made, and computational researchers will be supported through direct access to both primary and processed data (from grantees) via well-documented consumer APIs. Efficient access will ensure that scientific analysis methods can keep pace with the transformative advances and rapid innovation in imaging- and sequencing-based methods. Establishing the DCA-DCC will maximise the utility of the Developmental Atlas data and is critical to achieving the long-term vision of a reference map of all human cells at different developmental stages that is both accessible and useful to the scientific and medical community.
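As a rough illustration of the validation step described above, the sketch below checks one JSON metadata record for required fields and for an ontology-style identifier. It is not the DCC's validator: the field names (`tissue_source`, `developmental_stage`, `assay_type`), the `validate_record` function and the placeholder identifier are all assumptions made for this example, and the actual HCA metadata schemas define their own structure.

```python
import json
import re

# Hypothetical required fields, chosen to mirror the facets named above.
# The real HCA metadata schemas define their own field names.
REQUIRED_FIELDS = ("tissue_source", "developmental_stage", "assay_type")

# Accepts CURIE-style ontology identifiers of the form "PREFIX:digits".
ONTOLOGY_ID = re.compile(r"^[A-Za-z]+:\d+$")


def validate_record(record):
    """Return a list of problems found in one metadata record.

    An empty list means the record passed these illustrative checks.
    """
    problems = []
    for field in REQUIRED_FIELDS:
        if field not in record:
            problems.append(f"missing field: {field}")

    tissue = record.get("tissue_source")
    if isinstance(tissue, dict):
        ontology_id = tissue.get("ontology", "")
        if not isinstance(ontology_id, str) or not ONTOLOGY_ID.match(ontology_id):
            problems.append(f"bad ontology id: {ontology_id!r}")
    elif tissue is not None:
        problems.append("tissue_source must be an object")
    return problems


# A made-up submission; "EXAMPLE:0000001" is a placeholder, not a real term.
submission = json.loads("""
{
  "tissue_source": {"text": "placenta", "ontology": "EXAMPLE:0000001"},
  "developmental_stage": "placeholder stage",
  "assay_type": "scRNA-Seq"
}
""")

print(validate_record(submission))
```

Returning a list of problems, rather than raising on the first one, lets a contributor see every issue in a record in a single pass, which suits spreadsheet-style submissions where several cells may need correcting at once.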
Planned Impact
Recent transformative advances in experimental and computational methods are enabling high-throughput, quantitative and spatially resolved profiling of individual cells. The Developmental Cell Atlas will provide a foundational dataset that will enable us to understand how cells function in normal tissues and how they change over time during development, in normal life and in disease. This will give deeper insight into human health and into diagnosing, monitoring and treating disease. The Developmental Cell Atlas Data Coordination Centre (DCA DCC) will provide the necessary infrastructure to collect, share and preserve these data, respecting the ethical constraints.
Existing challenges in data science and in the reproducibility of biomedical data in academic research call for uniform metadata standards to ensure reproducibility and correct interpretation of the resulting data. The DCC will deliver the infrastructure, standards and processes to address these challenges for the developmental atlas and, critically, will provide the wide community of researchers with a dataset that is interoperable with a much larger dataset than the Developmental Atlas alone. Freely available code and unique, interoperable and accessible datasets with robust, semantically comparable data, which make data cleaning significantly easier and enable machine learning approaches, will benefit data scientists and analysts worldwide and provide a benchmark dataset that might be used for comparing different analytical approaches.
This dataset will enhance research capacity, data, knowledge and skills beyond academic research, because the data and tools carry clear and permissive licences and are intended to be used by third parties from industry (big pharma, SMEs) in the biotech, AI and data management sectors, through existing databases such as ENA and via the Human Cell Atlas - Data Coordination Platform. The data and knowledge generated can be used by researchers in industry to design new drugs, to repurpose drugs and in early disease detection.
The DCA DCC will bring together support for novel data generation technologies (for example, imaging), bioinformatics and the use of cloud technology for big data, and will thereby maximise the impact of the data among diverse academic communities and promote multidisciplinary research between these communities. Through collaboration with the parallel Developmental Atlas training proposal, all the resources will be shared more widely through the international 'Train Online' Platform (https://www.ebi.ac.uk/training/online/), and a webinar on the work of the DCA DCC will be shared via the Human Cell Atlas communications channels, via ELIXIR and via the EMBL-EBI Train Online platform.
DCC Beneficiaries:
- Grantees for the call will benefit as they will generate FAIR data stored in sustainable resources.
- Biological researchers will benefit as they will be able to access and query scRNA-Seq data and linked spatial transcriptomics data.
- Bioinformatics resource providers will benefit as they will be able to access the data and integrate them with their own resources. They will also be able to access code under a permissive licence.
- Publishers will be able to link consistently to the datasets supporting publications.
- Translational researchers in industry and academia will benefit from well-annotated data with standard metadata, which promotes sharing and re-use and maximises the value of research spend and researcher effort.
- Methods researchers will benefit as there will be stable data releases with which to work, and the data will conform to metadata standards.
- Scientists in training will benefit from access to stable, versioned data resources critical for training activities which can then be linked to the data releases.
Organisations
- EMBL - European Bioinformatics Institute (Lead Research Organisation)
- University of Oxford (Collaboration)
- University of Edinburgh (Collaboration)
- University College London (Collaboration)
- Newcastle University (Collaboration)
- Stanford University (Collaboration)
- The Wellcome Trust Sanger Institute (Collaboration)
- University of Manchester (Collaboration)
- University of California, Santa Cruz (Collaboration)
- University of Cambridge (Collaboration)
- Chan Zuckerberg Initiative (Collaboration)
- Broad Institute (Collaboration)
Description | (HCA Organoid) - HCA|Organoid: Pilot action to establish a multi-tissue human organoid platform within the Human Cell Atlas as a booster of future disease-centric, mechanistic, and translational research |
Amount | € 4,998,687 (EUR) |
Funding ID | 874769 |
Organisation | European Commission |
Sector | Public |
Country | European Union (EU) |
Start | 01/2020 |
End | 03/2022 |
Description | ORGND_Sequencing data integration |
Amount | $682,956 (USD) |
Funding ID | INV-035665 |
Organisation | Bill and Melinda Gates Foundation |
Sector | Charity/Non Profit |
Country | United States |
Start | 12/2021 |
End | 11/2023 |
Title | Human Cell Atlas Data Coordination Platform Data Release |
Description | Data releases are issued on a monthly basis, representing the full cohort of Human Cell Atlas data available from the data coordination platform. |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | The HCA DCP data release contains data from the MRC Development Cell Atlas; as of March 2022, it contains 16 projects and a total of 5.0M cells from embryo or development-associated tissues (e.g. placenta). These data are openly accessible to researchers worldwide and can be used for downstream research, including cell type characterisation. |
URL | https://data.humancellatlas.org/ |
Title | The Human Cell Atlas Metadata Standards |
Description | A set of JSON schemas that define metadata standards for developmental and adult human biological samples and for cellular-resolution assays such as single-cell RNA-seq, single-cell ATAC-seq and spatial transcriptomics. |
Type Of Material | Database/Collection of data |
Year Produced | 2019 |
Provided To Others? | Yes |
Impact | These standards are building on years of experience gained by EMBL-EBI archiving information about biological samples and transcriptomic data through databases such as BioSamples and ArrayExpress. The standards defined here will ensure the data collected by the MRC Development Cell Atlas are computationally readable and support the Findability, Accessibility, Interoperability and Reusability (FAIR) of the generated data. The FAIRness of the MRC Development Cell Atlas data will mean that maximal value can be derived from it both for the projects it was collected for and in future efforts which will be able to reuse the data. |
URL | https://github.com/HumanCellAtlas/metadata-schema |
Description | Biological sample record flow from HDBR to BioSamples |
Organisation | Newcastle University |
Department | Institute of Genetic Medicine |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The EMBL-EBI team is working with HDBR representatives to deposit HDBR sample records in the BioSamples database. The goal is to establish a standard flow of sample records from HDBR to BioSamples. The EMBL-EBI team is harmonizing the sample metadata from HDBR with the HCA sample metadata standards. This work will ensure these records can be seamlessly integrated into HCA data contributions. |
Collaborator Contribution | The HDBR representatives are providing example sample metadata from their own records and working with the EMBL-EBI team to ensure our harmonization is accurate and representative of their own sample data. |
Impact | This collaboration will result in HDBR sample records deposited in BioSamples |
Start Year | 2019 |
Description | Biological sample record flow from HDBR to BioSamples |
Organisation | University College London |
Department | MRC/Wellcome Trust Human Developmental Biology Resource |
Country | United Kingdom |
Sector | Private |
PI Contribution | The EMBL-EBI team is working with HDBR representatives to deposit HDBR sample records in the BioSamples database. The goal is to establish a standard flow of sample records from HDBR to BioSamples. The EMBL-EBI team is harmonizing the sample metadata from HDBR with the HCA sample metadata standards. This work will ensure these records can be seamlessly integrated into HCA data contributions. |
Collaborator Contribution | The HDBR representatives are providing example sample metadata from their own records and working with the EMBL-EBI team to ensure our harmonization is accurate and representative of their own sample data. |
Impact | This collaboration will result in HDBR sample records deposited in BioSamples |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | Newcastle University |
Department | Institute of Genetic Medicine |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the data coordination centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Coordinating centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | The Wellcome Trust Sanger Institute |
Department | Human Genetics |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the data coordination centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Coordinating centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Cambridge |
Department | Cambridge Stem Cell Institute |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the data coordination centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Coordinating centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Cambridge |
Department | Department of Haematology |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the data coordination centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Coordinating centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Cambridge |
Department | Department of Medicine |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the data coordination centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Coordinating centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Cambridge |
Department | Department of Physiology, Development and Neuroscience |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the data coordination centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Coordinating centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Edinburgh |
Department | Institute of Genetics & Molecular Medicine |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the data coordination centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Coordinating centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Edinburgh |
Department | MRC Centre for Regenerative Medicine |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the data coordination centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Coordinating centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Manchester |
Department | Division of Diabetes, Endocrinology & Gastroenterology |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the data coordination centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Coordinating centre during 2020. This data will be incorporated into the HCA Data Coordination Platform as well as deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Manchester |
Department | School of Dentistry Manchester |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform and deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Oxford |
Department | Department of Paediatrics |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform and deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Oxford |
Department | Experimental Medicine Division |
Country | United Kingdom |
Sector | Hospitals |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform and deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Development Cell Atlas - Data Coordination Centre |
Organisation | University of Oxford |
Department | Kennedy Institute of Rheumatology |
Country | United Kingdom |
Sector | Academic/University |
PI Contribution | The main goal of our award is to support the deposition, validation and quality assurance of the data generated by awardees for the MRC Development Cell Atlas. A significant part of achieving this goal involves establishing working relationships with each of the awardees and gaining an understanding of the types of data they are generating. This helps us prepare to receive their data when it is ready and allows us to have any new standards or data deposition tools established. During 2019 we have been in contact with each awardee. This includes regular email communication with every project. We conducted a site visit to one of the University of Edinburgh teams to demonstrate data contribution processes and discuss their requirements. We also attend the monthly phone calls for the eye development atlas projects at the University of Edinburgh and Newcastle University. We are expecting all the projects to start delivering data in Quarter 2 of 2020 and the majority of data to be provided by the end of 2020. |
Collaborator Contribution | Each awardee has maintained contact with the Data Coordination Centre during the first year of the Development Cell Atlas. As their data generation efforts finish, they will all deliver data to the Development Cell Atlas Data Coordination Centre during 2020. This data will be incorporated into the HCA Data Coordination Platform and deposited in the appropriate EMBL-EBI archival resources. |
Impact | No data has been deposited yet but we are expecting this data to arrive during 2020. |
Start Year | 2019 |
Description | The Human Cell Atlas Data Coordination Platform |
Organisation | Broad Institute |
Country | United States |
Sector | Charity/Non Profit |
PI Contribution | The EMBL-EBI team leads the establishment of metadata standards and the provision of data ingestion services to the Human Cell Atlas Data Coordination Platform (HCA-DCP). In our role as the MRC Development Cell Atlas Data Coordination Centre, we use the existing services built as part of this collaboration and are extending and improving them. Our work enables the Data Coordination Platform to share data generated for the MRC Development Cell Atlas. We have evolved the existing metadata standards to capture the specific information required for the Development Cell Atlas. This work ensures these data are maximally useful for building the Development Cell Atlas and can be integrated into the comprehensive Human Cell Atlas. |
Collaborator Contribution | The Data Coordination Platform (DCP) is built from components, and each institution is responsible for different pieces. The Broad Institute is responsible for developing and operating our standard analysis pipelines, which align the transcriptomic data to the genome and produce gene count matrices for each sample. The University of California, Santa Cruz is responsible for operating the data storage for the DCP. They also design and manage the data portal (https://data.humancellatlas.org/), which provides the community with tools to explore and download the HCA data. Stanford University is responsible for shepherding data releases to give the HCA community harmonised and consistently analysed data collections to use in their atlas building efforts. The Chan Zuckerberg Initiative is building services to provide access to the gene count matrix files from the DCP and to support querying the DCP to find specific subsets of the data. These components, together with the metadata standards and data ingestion services built by EMBL-EBI, give the HCA community a data platform. This platform helps the community share, explore and analyse all contributed data, moving toward the ultimate goal of the Human Cell Atlas. |
Impact | The major outputs from this collaboration are the software components being used to run the Data Coordination Platform. The software is all held under the Human Cell Atlas GitHub organisation https://github.com/HumanCellAtlas. |
Start Year | 2017 |
Description | The Human Cell Atlas Data Coordination Platform |
Organisation | Chan Zuckerberg Initiative |
Country | United States |
Sector | Private |
PI Contribution | The EMBL-EBI team leads the establishment of metadata standards and the provision of data ingestion services to the Human Cell Atlas Data Coordination Platform (HCA-DCP). In our role as the MRC Development Cell Atlas Data Coordination Centre, we use the existing services built as part of this collaboration and are extending and improving them. Our work enables the Data Coordination Platform to share data generated for the MRC Development Cell Atlas. We have evolved the existing metadata standards to capture the specific information required for the Development Cell Atlas. This work ensures these data are maximally useful for building the Development Cell Atlas and can be integrated into the comprehensive Human Cell Atlas. |
Collaborator Contribution | The Data Coordination Platform (DCP) is built from components, and each institution is responsible for different pieces. The Broad Institute is responsible for developing and operating our standard analysis pipelines, which align the transcriptomic data to the genome and produce gene count matrices for each sample. The University of California, Santa Cruz is responsible for operating the data storage for the DCP. They also design and manage the data portal (https://data.humancellatlas.org/), which provides the community with tools to explore and download the HCA data. Stanford University is responsible for shepherding data releases to give the HCA community harmonised and consistently analysed data collections to use in their atlas building efforts. The Chan Zuckerberg Initiative is building services to provide access to the gene count matrix files from the DCP and to support querying the DCP to find specific subsets of the data. These components, together with the metadata standards and data ingestion services built by EMBL-EBI, give the HCA community a data platform. This platform helps the community share, explore and analyse all contributed data, moving toward the ultimate goal of the Human Cell Atlas. |
Impact | The major outputs from this collaboration are the software components being used to run the Data Coordination Platform. The software is all held under the Human Cell Atlas GitHub organisation https://github.com/HumanCellAtlas. |
Start Year | 2017 |
Description | The Human Cell Atlas Data Coordination Platform |
Organisation | Stanford University |
Country | United States |
Sector | Academic/University |
PI Contribution | The EMBL-EBI team leads the establishment of metadata standards and the provision of data ingestion services to the Human Cell Atlas Data Coordination Platform (HCA-DCP). In our role as the MRC Development Cell Atlas Data Coordination Centre, we use the existing services built as part of this collaboration and are extending and improving them. Our work enables the Data Coordination Platform to share data generated for the MRC Development Cell Atlas. We have evolved the existing metadata standards to capture the specific information required for the Development Cell Atlas. This work ensures these data are maximally useful for building the Development Cell Atlas and can be integrated into the comprehensive Human Cell Atlas. |
Collaborator Contribution | The Data Coordination Platform (DCP) is built from components, and each institution is responsible for different pieces. The Broad Institute is responsible for developing and operating our standard analysis pipelines, which align the transcriptomic data to the genome and produce gene count matrices for each sample. The University of California, Santa Cruz is responsible for operating the data storage for the DCP. They also design and manage the data portal (https://data.humancellatlas.org/), which provides the community with tools to explore and download the HCA data. Stanford University is responsible for shepherding data releases to give the HCA community harmonised and consistently analysed data collections to use in their atlas building efforts. The Chan Zuckerberg Initiative is building services to provide access to the gene count matrix files from the DCP and to support querying the DCP to find specific subsets of the data. These components, together with the metadata standards and data ingestion services built by EMBL-EBI, give the HCA community a data platform. This platform helps the community share, explore and analyse all contributed data, moving toward the ultimate goal of the Human Cell Atlas. |
Impact | The major outputs from this collaboration are the software components being used to run the Data Coordination Platform. The software is all held under the Human Cell Atlas GitHub organisation https://github.com/HumanCellAtlas. |
Start Year | 2017 |
Description | The Human Cell Atlas Data Coordination Platform |
Organisation | University of California, Santa Cruz |
Country | United States |
Sector | Academic/University |
PI Contribution | The EMBL-EBI team leads the establishment of metadata standards and the provision of data ingestion services to the Human Cell Atlas Data Coordination Platform (HCA-DCP). In our role as the MRC Development Cell Atlas Data Coordination Centre, we use the existing services built as part of this collaboration and are extending and improving them. Our work enables the Data Coordination Platform to share data generated for the MRC Development Cell Atlas. We have evolved the existing metadata standards to capture the specific information required for the Development Cell Atlas. This work ensures these data are maximally useful for building the Development Cell Atlas and can be integrated into the comprehensive Human Cell Atlas. |
Collaborator Contribution | The Data Coordination Platform (DCP) is built from components, and each institution is responsible for different pieces. The Broad Institute is responsible for developing and operating our standard analysis pipelines, which align the transcriptomic data to the genome and produce gene count matrices for each sample. The University of California, Santa Cruz is responsible for operating the data storage for the DCP. They also design and manage the data portal (https://data.humancellatlas.org/), which provides the community with tools to explore and download the HCA data. Stanford University is responsible for shepherding data releases to give the HCA community harmonised and consistently analysed data collections to use in their atlas building efforts. The Chan Zuckerberg Initiative is building services to provide access to the gene count matrix files from the DCP and to support querying the DCP to find specific subsets of the data. These components, together with the metadata standards and data ingestion services built by EMBL-EBI, give the HCA community a data platform. This platform helps the community share, explore and analyse all contributed data, moving toward the ultimate goal of the Human Cell Atlas. |
Impact | The major outputs from this collaboration are the software components being used to run the Data Coordination Platform. The software is all held under the Human Cell Atlas GitHub organisation https://github.com/HumanCellAtlas. |
Start Year | 2017 |
Title | The Human Cell Atlas Data Coordination Platform (HCA-DCP) Ingestion Service |
Description | This software provides all the functionality to validate and write data to the HCA DCP. |
Type Of Technology | Software |
Year Produced | 2017 |
Open Source License? | Yes |
Impact | The HCA DCP ingestion service provides vital infrastructure that enables the MRC Development Cell Atlas to share any generated data via both the HCA DCP and EMBL-EBI archives such as BioSamples and the European Nucleotide Archive. This functionality ensures the community can access and use all the generated data to build the development atlas specifically and all parts of the Human Cell Atlas. This software has been, and will continue to be, updated during the operation of the MRC Development Cell Atlas to ensure the DCP can capture the biological samples and experimental assays used by the researchers who are part of the MRC Development Cell Atlas. |
URL | https://contribute.data.humancellatlas.org/ |
Description | Face to Face Meeting with MRC Dev Atlas Awardees - Edinburgh |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Professional Practitioners |
Results and Impact | This meeting presented the overall Human Cell Atlas Data Coordination Platform to the MRC Development Cell Atlas lab in Edinburgh and gave them an overview of the data contribution process. This visit allowed the EMBL-EBI team to gain a greater understanding of how to support the Edinburgh lab and ensure the data they are generating can be supported by the Data Coordination Platform. |
Year(s) Of Engagement Activity | 2019 |
Description | Managing single cell transcriptomics data |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Members of the EMBL-EBI team planned and delivered the Data Management of Single Cell Data training course. They gave presentations and practical demonstrations to the course attendees. These covered data management best practice; how to contribute data to the Human Cell Atlas; and how to store and structure data to ensure it is Findable, Accessible, Interoperable and Reusable (FAIR). One part of the course focused on the use of ontologies to provide semantics to the metadata associated with an experiment. Another focused on spatially resolved transcriptomics and experimental design considerations. |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.ebi.ac.uk/training/events/2019/managing-single-cell-transcriptomics-data |
Description | Single cell RNA-seq analysis: From questions to clusters - Spatial Transcriptomics |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | A member of the EMBL-EBI team introduced the Human Cell Atlas Data Coordination Platform and gave a presentation about the spatial transcriptomics methods being used as part of the Human Cell Atlas project. This increased awareness of spatial transcriptomics, a newer experimental approach, among the course attendees and gave them insight into how to store and process data of this type, so they are better prepared when they encounter it in their own labs. |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.ebi.ac.uk/training/events/2019/single-cell-rna-seq-analysis-questions-clusters |
Description | The Human Cell Atlas Data Coordination Platform |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Postgraduate students |
Results and Impact | The MRC Development Cell Atlas Data Wrangler attended the Norwich Single Cell Symposium and presented a poster about the Human Cell Atlas Data Coordination Platform and our efforts to ensure the data collected by the Development Cell Atlas and other cell atlasing initiatives is Findable, Accessible, Interoperable and Reusable. Many symposium attendees expressed interest in our efforts and in how they could adopt the HCA metadata standards and apply them to their own work. |
Year(s) Of Engagement Activity | 2019 |
URL | https://www.earlham.ac.uk/single-cell-symposium-2019 |
Description | The Human Cell Atlas Data Coordination Platform and Metadata Standards |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Members of the EMBL-EBI team planned and delivered the Data Management of Single Cell Data training course. They gave presentations and practical demonstrations to the course attendees. These covered data management best practice; how to contribute data to the Human Cell Atlas; and how to store and structure data to ensure it is Findable, Accessible, Interoperable and Reusable (FAIR). |
Year(s) Of Engagement Activity | 2020 |
URL | https://www.ebi.ac.uk/training/events/starting-single-cell-rna-seq-analysis-virtual/ |
Description | The Human Cell Atlas Data Coordination Platform and Metadata Standards |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | Members of the EMBL-EBI team planned and delivered the Data Management of Single Cell Data training course. They gave presentations and practical demonstrations to the course attendees. These covered data management best practice; how to contribute data to the Human Cell Atlas; and how to store and structure data to ensure it is Findable, Accessible, Interoperable and Reusable (FAIR). |
Year(s) Of Engagement Activity | 2021 |
URL | https://www.ebi.ac.uk/training/events/single-cell-rna-seq-analysis-using-r-virtual/ |