HUGS: a Hub for Uk Greenhouse gas data Science

Lead Research Organisation: University of Bristol
Department Name: Chemistry


Atmospheric observations of greenhouse gas (GHG) concentrations can be used to estimate emissions when combined with models of atmospheric transport and an understanding of the emission sources surrounding the observations. These top-down methods are complementary to the bottom-up, accounting-based, approaches that are currently used to create national GHG inventories. To improve the transparency and accuracy of these inventories and better evaluate progress on emissions reduction policies, scientists and policy makers have been advocating for the integration of top-down methods into the emissions reporting process. The United Nations Framework Convention on Climate Change (UNFCCC) recently acknowledged the important role that emissions quantified through atmospheric observations could have in supporting inventory evaluation (UNFCCC, COP 23, SBSTA/2017/L.21). The UK GHG science community is leading the world in this regard, with a dedicated national monitoring network, a range of regional networks and regular over-passes by various satellites. Currently, the UK is one of only three countries on Earth to include top-down estimates in its National Inventory Report to the UNFCCC.

The process of inferring emissions from GHG observations is extremely data intensive. To understand the observed variability in GHG concentrations, scientists must combine data from diverse networks that operate in different environments and use different instrumentation, understand the distribution of potential sources and land use types in the vicinity of each sensor, and accurately model the atmospheric processes that transport GHGs from sources to the measurement site. As a result, analysis of GHG data has to date largely been carried out on a case-by-case basis for individual research papers.
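To make the top-down idea concrete, the following is a minimal sketch, not the method used by the project team. It assumes a single emission source whose rate x is constant, a known background concentration at each time, and a per-observation sensitivity taken from a transport model, so that each observation is approximately background plus sensitivity times x. The function name, the variable names and the synthetic numbers are all hypothetical; real inversions solve for many sources at once and account for measurement and model uncertainty.

```python
def estimate_emission_rate(observed, baseline, sensitivity):
    """Least-squares estimate of one emission rate x, assuming
    observed[i] ~= baseline[i] + sensitivity[i] * x.

    sensitivity[i] stands in for transport-model output: the concentration
    enhancement at the sensor per unit of emission (an assumption of this sketch).
    """
    if not (len(observed) == len(baseline) == len(sensitivity)):
        raise ValueError("inputs must have equal length")
    # Enhancement above background that the source must explain.
    enhancement = [o - b for o, b in zip(observed, baseline)]
    numerator = sum(h * e for h, e in zip(sensitivity, enhancement))
    denominator = sum(h * h for h in sensitivity)
    if denominator == 0:
        raise ValueError("all sensitivities are zero; the emission rate is unconstrained")
    return numerator / denominator


# Synthetic, illustrative values only (not real measurements).
sensitivity = [0.5, 1.0, 1.5]
baseline = [1900.0, 1900.0, 1900.0]
observed = [1901.0, 1902.0, 1903.0]
rate = estimate_emission_rate(observed, baseline, sensitivity)
```

Even in this toy form, the estimate needs three aligned inputs (observations, background and modelled sensitivities), which is why combining them is so data intensive in practice.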

Here, we propose that new developments in cloud computing are required to help GHG scientists overcome some of the major obstacles to the integration of GHG networks and the production of operational, higher resolution GHG flux estimates. We will create a cloud-based framework for a UK GHG data science "hub". This hub will allow users (GHG scientists and, eventually, the public) to:

- Improve the flow of information to and from GHG data providers, because cloud services are not behind institutional firewalls
- Operationalise the processing of datasets into common formats, which can then be made globally accessible to users (subject to any required usage restrictions)
- Automatically trigger operations on new data, such as the running of chemical transport models, which are essential for the interpretation of GHG data
- Analyse data, model output and ancillary information (maps of land use, emissions inventories, etc.) on the cloud, without the need for individual users to download datasets and run models (requiring technical expertise)
- Visualise data, models and other relevant information on a web-based platform
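The second and third capabilities in the list above (processing datasets into a common format, and automatically triggering operations on new data) can be sketched as follows. This is a hypothetical, in-memory illustration rather than the HUGS implementation: the `Hub` class, the `on_new_data` decorator, the record schema and the unit table are all assumptions made for this example, and a real deployment would use an object store and serverless functions instead of Python lists and callbacks.

```python
from datetime import datetime, timezone

# Assumed conversion factors to a common unit (parts per billion).
TO_PPB = {"ppb": 1.0, "ppm": 1000.0, "ppt": 0.001}


def standardise(record):
    """Convert one raw reading into a common (hypothetical) schema."""
    timestamp = datetime.fromtimestamp(record["epoch_s"], tz=timezone.utc)
    return {
        "site": record["site"].upper(),
        "species": record["species"].lower(),
        "time": timestamp.isoformat(),
        "mole_fraction_ppb": record["value"] * TO_PPB[record["units"]],
    }


class Hub:
    """Toy stand-in for a cloud data hub: stores data and fires triggers."""

    def __init__(self):
        self.store = []      # stands in for a distributed object store
        self.triggers = []   # stands in for registered serverless functions

    def on_new_data(self, func):
        """Register a function to run whenever a new batch is uploaded."""
        self.triggers.append(func)
        return func

    def upload(self, raw_records):
        """Standardise a batch, store it, then run every registered trigger."""
        batch = [standardise(r) for r in raw_records]
        self.store.extend(batch)
        return [trigger(batch) for trigger in self.triggers]


hub = Hub()


@hub.on_new_data
def request_model_run(batch):
    # Placeholder for launching a chemical transport model run.
    sites = sorted({r["site"] for r in batch})
    return "transport model run requested for " + ", ".join(sites)


results = hub.upload(
    [{"site": "site_a", "species": "CH4", "epoch_s": 0, "value": 2.0, "units": "ppm"}]
)
```

Standardising at upload time means every trigger sees the same schema, so downstream analysis does not need to know which network or instrument produced a reading.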

Our team is world-leading in the measurement and analysis of GHGs, cloud computing and spatial mapping. This project will rely heavily on a cloud platform (built as part of the EPSRC-funded BioSimSpace project) and a GHG analysis codebase, both of which have already been developed by team members. Both are built on standard tools such as Jupyter notebooks, distributed object stores and serverless functions. This expertise and these open tools will allow us to develop a framework for our data science hub that GHG researchers will be able to extend at the end of this project.

We envisage that such a hub could be at the centre of the UK's large and growing GHG science community, allowing scientists to upload, analyse and visualise their data on a single platform, enhancing data integration and sharing between groups. Ultimately, this platform could be extended to allow the public to interact with GHG data, letting them learn whether the UK's emissions reduction efforts are reflected in atmospheric observations.

Planned Impact

Our impact will target the following groups, with whom we already enjoy strong links:

1. UK inventory teams: Our work will directly benefit the government Department for Business, Energy and Industrial Strategy (BEIS) and Defra, who are responsible for delivering the GHG inventory under the UNFCCC and Kyoto agreements. We will build on our existing collaborations, for example with Ricardo Energy and Environment (contractors with overall responsibility for the national inventory), to ensure that new inventory developments are incorporated into our project. The impact will be improvements in monitoring progress towards climate goals, and ultimately better-informed decisions on how to reach those goals.
2. Next generation of greenhouse gas scientists.
3. The wider academic and industrial community who are evaluating use of the cloud.
4. Cloud providers who are working to adapt their offerings for the UK research community.
5. The public: The general public are increasingly engaged in climate issues and wish to better understand their country's impact on climate.

We will engage with these users through the following methods/activities:

1. We will present our hub at the UK National Inventory Steering Committee (NISC) annual meetings, which our team already attend. By working closely with inventory compilers, we will ensure that the UK inventory is closely integrated into our platform. By operationalising the process of comparing the inventory to atmospheric observations, we will provide more responsive, near-real-time updates to policy makers on any discrepancies with the inventory.
2. Post-graduate students studying GHG science experience a steep learning curve, and few are based in an environment where data and models are routinely combined to understand GHG data. Our hub will provide a platform where GHG data can be visualised and compared to models, mapping information and meteorology, to allow students to better understand the content of their data. Furthermore, by providing a central location for GHG data and models, we envisage improved collaboration between students working in this field across the country.
3. We will develop case studies from our use of the cloud, covering both the benefits and the drawbacks of this delivery model for data storage and computing. These will include a comparison between running simulations on on-premises hardware and running the same calculations on the cloud. The case studies will be shared with the wider research, HPC and funder communities via our existing links to, for example, the EPSRC eInfrastructure SAT, the RSE community and HPC-SIG.
4. We will work closely with cloud providers to help them map their service offerings to the needs of the UK research community. In particular, in this project we have partnered with Oracle to help develop and optimise the Fn service. This will include regular (fortnightly) meetings with Cloud engineers and Fn software developers, early access to the production Fn service, and collaborative working to ensure that the service is optimised and used effectively to deploy HUGS. We will share this information with other cloud providers via our existing close links, and with the wider cloud community via a presentation at KubeCon 2019. The aim will be to demonstrate the power and flexibility of open source serverless frameworks, thereby encouraging the community to move away from proprietary platforms (e.g. AWS Lambda or Azure Functions).
5. Ultimately, we aim to make an interactive web interface to our hub that will be accessible to the public. Such a platform will increase the confidence of the public in the UK's GHG concentrations. Our team also has a track record of press engagement (e.g. recently featuring in BBC Radio 4's "Inside Science"), which we will continue throughout this project. Our impact here will be to make the issues understood by a wider audience, allowing them to be engaged in the national debate.

