G8 Multilateral Research Funding - ExArch

Lead Research Organisation: STFC - Laboratories
Department Name: RAL Space


The demands that climate science places on data management are growing rapidly as climate models increase both the precision with which they depict spatial structures and the completeness with which they describe a vast range of physical processes.
For the Coupled Model Intercomparison Project Phase 5 (CMIP5), a distributed archive is being constructed to provide access to what is expected to be in excess of 10 petabytes of global climate change projections. The data will be held at 30 or more computing centres and data archives around the world, but to users it will appear as a single archive described by one catalogue. In addition, the usability of the data will be enhanced by a three-step validation process and the publication of Digital Object Identifiers (DOIs) for all the data.
For many users the spatial resolution provided by the global climate models (around 150 km) is inadequate: the CORDEX project will provide data downscaled to around 10 km. Evaluation of climate impacts often revolves around extremes and complex impact factors, requiring high volumes of data to be stored. At the same time, uncertainty about the optimal configuration of the models imposes the requirement that each scenario be explored with multiple models. This project will explore the challenges of developing a software management infrastructure that will scale to the multi-exabyte archives of climate data which are likely to be crucial to major policy decisions by the end of the decade. Support for automated processing of the archived data and metadata will be essential. In the short term, strategies will be evaluated by applying them to the CORDEX project data.

Planned Impact

The project will develop software for access to high-impact climate archives.
Description This is a collaborative project, with partners from 6 countries. The non-UK partners are funded by their own research agencies. NERC has funded overall project leadership and the development of a processing service and a quality control system.
The processing service provides a range of tools for regridding and computing a broad range of statistical quantities.
The quality control tool provides improved flexibility and reliability. An emerging requirement, not foreseen in the proposal, was to test the consistency of the tables used to specify data format requirements. Specifications had previously been checked largely by proofreading, an approach that failed to catch the subtle interdependencies in the syntax of the requirements. The new software for verifying the consistency of requirements has supported 3 major international collaborative projects and is likely to form the basis of improved preparation of specifications for the CMIP6 archive.
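As an illustration of the kind of interdependency that proofreading tends to miss, the following is a minimal Python sketch of a cross-table consistency check. It is not the project's software: the table layout, the function name check_tables, and the example variable and dimension names are all assumptions made for illustration. It flags variables defined more than once and variables that refer to a dimension no table defines.

```python
from collections import Counter


def check_tables(variables, dimensions):
    """Return a list of inconsistency messages for a hypothetical pair of tables.

    variables:  list of dicts with keys "name" and "dimensions" (assumed layout)
    dimensions: dict mapping each defined dimension name to its attributes
    """
    problems = []

    # A variable name should be defined only once in the table.
    counts = Counter(v["name"] for v in variables)
    for name, n in sorted(counts.items()):
        if n > 1:
            problems.append(f"variable '{name}' is defined {n} times")

    # Every dimension a variable refers to must be defined somewhere.
    for v in variables:
        for dim in v["dimensions"]:
            if dim not in dimensions:
                problems.append(
                    f"variable '{v['name']}' uses undefined dimension '{dim}'"
                )
    return problems


# Hypothetical example tables (illustrative values, not taken from any real specification).
dimensions = {
    "time": {"units": "days since 1950-01-01"},
    "lat": {"units": "degrees_north"},
    "lon": {"units": "degrees_east"},
}
variables = [
    {"name": "tas", "dimensions": ["time", "lat", "lon"]},
    {"name": "pr", "dimensions": ["time", "lat", "lon"]},
    {"name": "tasmax", "dimensions": ["time", "latitude", "lon"]},  # misspelt reference
    {"name": "pr", "dimensions": ["time", "lat", "lon"]},  # duplicate entry
]

problems = check_tables(variables, dimensions)
for message in problems:
    print(message)
```

Each entry in the example reads plausibly on its own; the errors only show up when the tables are compared with one another, which is why an automated check is better suited to this task than reading the tables by eye.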
Exploitation Route Web Processing Services developed in the project are now providing a range of services that add value to the CEDA archive.
Sectors Digital/Communication/Information Technologies (including Software)

URL http://proj.badc.rl.ac.uk/exarch/
Description Findings are being used in preparation for the CMIP6 archive of climate model projections, and in providing CMIP5 data through subcontract to a commercial user. Partners in Germany (DKRZ) developed a web processing framework within the project which is now deployed operationally in Germany and is being considered for use in our data centre. Partners in France (IPSL) developed software for management of climate model documentation which is being deployed for documentation of UK climate models.
First Year Of Impact 2014
Sector Digital/Communication/Information Technologies (including Software), Energy
Impact Types Societal

Title CDB Query 
Description Python code to manage the analysis of climate model outputs published in the CMIP5 and CORDEX archives. This package provides simple tools to process data from the CMIP5 and CORDEX archives distributed by the Earth System Grid Federation.
Type Of Technology Software 
Year Produced 2014 
Open Source License? Yes  
Impact The product has been taken up and further developed in a US NSF grant.
URL https://pypi.python.org/pypi/cdb_query/1.5