Research Data Facility As A Service

Lead Research Organisation: University of Edinburgh
Department Name: Edinburgh Parallel Computing Centre

Abstract

We propose to provide a replacement for the Research Data Facility (RDF), operated at EPCC on behalf of EPSRC, as a service from May 2020 for a period of 12 months. The original RDF was first purchased in 2011 and expanded in 2013. Its oldest parts are over 8 years old and no longer maintainable. To reduce risk to users' data and to provide a low-cost solution to a complex data management challenge, EPCC proposes to operate the new RDF as a Service (RDFaaS), providing the upfront capital funding ourselves. EPSRC data will be copied from the old system to the new one as part of the project. In the first instance, the one-year service will provide 5 Petabytes of storage to the EPSRC ARCHER 2 and linked Tier 2 scientific communities.

The service will have exactly the same goals as the original service, namely to:

1. Provide a secure, well-managed, robust, high-performance research data repository for EPSRC National HPC service users from across the UK;
2. Support collaborative working within consortia and with external collaborators including international collaborators; and
3. Ensure data can be copied simply on and off the service, both to the National HPC service and to the users' home institutions, as required.

Planned Impact

The RDFaaS e-Infrastructure project will continue to deliver the strong impact of the initial RDF service through:

1. Knowledge impacts: providing a long-term data infrastructure for all of the EPSRC scientific community who use the National HPC service, enabling the user community to deliver world-class science on the new ARCHER 2 system.
2. Economic impacts: ensuring that the research performed by the EPSRC scientific community continues to contribute to the long-term economic well-being of the country and supporting collaborative projects with industry.
3. Societal impacts: developing and improving the multitude of items we use in our day-to-day lives and contributing to our understanding of global challenges such as climate change and sustainability.
4. People and skills impacts: ensuring the scientific community have the appropriate skills and training to manage and curate the scientific data they produce on the National HPC service.

Fundamentally, the RDFaaS is an enabling data infrastructure, supporting the EPSRC user community in their science and innovation goals.
