HUGS: a Hub for Uk Greenhouse gas data Science

Lead Research Organisation: University of Bristol
Department Name: Chemistry

Abstract

Atmospheric observations of greenhouse gas (GHG) concentrations can be used to estimate emissions when combined with models of atmospheric transport and an understanding of the emission sources surrounding the observations. These top-down methods are complementary to the bottom-up, accounting-based, approaches that are currently used to create national GHG inventories. To improve the transparency and accuracy of these inventories and better evaluate progress on emissions reduction policies, scientists and policy makers have been advocating for the integration of top-down methods into the emissions reporting process. The United Nations Framework Convention on Climate Change (UNFCCC) recently acknowledged the important role that emissions quantified through atmospheric observations could have in supporting inventory evaluation (UNFCCC, COP 23, SBSTA/2017/L.21). The UK GHG science community is leading the world in this regard, with a dedicated national monitoring network, a range of regional networks and regular over-passes by various satellites. Currently, the UK is one of only three countries on Earth to include top-down estimates in its National Inventory Report to the UNFCCC.

The process of inferring emissions from GHG observations is extremely data intensive. In order to understand the observed variability in GHG concentrations, scientists must combine data from diverse networks in different environments and using different instrumentation, understand the distribution of potential sources and land use types in the vicinity of the sensor and be able to accurately model the atmospheric processes that transport GHGs from sources to the measurement site. Therefore, to date, analysis of GHG data is largely carried out on a case-by-case basis for individual research papers.

Here, we propose that new developments in cloud computing are required to help GHG scientists overcome some of the major obstacles for the integration of GHG networks and the production of operational, higher resolution GHG flux estimates. We will create the cloud-based framework for a UK GHG data science "hub". This hub will allow users (GHG scientists and, eventually, the public) to:

- Improve the flow of information to and from GHG data providers, because cloud services are not behind institutional firewalls
- Operationalise the processing of datasets into common formats, which can then be made globally accessible to users (subject to any required usage restrictions)
- Automatically trigger operations on new data, such as the running of chemical transport models, which are essential for the interpretation of GHG data
- Analyse data, model output and ancillary information (maps of land use, emissions inventories, etc.) on the cloud, without the need for individual users to download datasets and run models (requiring technical expertise)
- Visualise data, models and other relevant information on a web-based platform
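The upload, standardise and trigger flow described in the list above can be illustrated with a minimal, in-memory sketch. Everything here is hypothetical and for illustration only: `ObjectStore`, `standardise`, `queue_transport_model`, the `raw/` and `standard/` key prefixes, and the record field names are assumptions, not part of the HUGS codebase or of any cloud provider's API. A real deployment would use a distributed object store and serverless functions instead of the in-process calls shown.

```python
from dataclasses import dataclass, field
from typing import Callable, Dict, List, Tuple

# A handler receives the store, the key that was written, and the record.
Handler = Callable[["ObjectStore", str, dict], None]


@dataclass
class ObjectStore:
    """In-memory stand-in for a cloud object store (hypothetical)."""

    objects: Dict[str, dict] = field(default_factory=dict)
    handlers: List[Tuple[str, Handler]] = field(default_factory=list)

    def on_upload(self, prefix: str, handler: Handler) -> None:
        """Register a handler for every key written under `prefix`."""
        self.handlers.append((prefix, handler))

    def put(self, key: str, record: dict) -> None:
        """Store a record, then call each handler whose prefix matches."""
        self.objects[key] = record
        for prefix, handler in self.handlers:
            if key.startswith(prefix):
                handler(self, key, record)


def standardise(store: ObjectStore, key: str, record: dict) -> None:
    """Convert a raw provider record to a common format (assumed fields)."""
    common = {
        "site": record["site"].upper(),
        "species": record["gas"].lower(),
        "mole_fraction": float(record["value"]),
    }
    # Writing under "standard/" triggers the next stage of the pipeline.
    store.put(key.replace("raw/", "standard/", 1), common)


# Stand-in for a job queue; a real hub would submit a model run instead.
model_queue: List[str] = []


def queue_transport_model(store: ObjectStore, key: str, record: dict) -> None:
    """Queue a transport model run for newly standardised data."""
    model_queue.append(
        f"run transport model for {record['site']} {record['species']}"
    )


store = ObjectStore()
store.on_upload("raw/", standardise)
store.on_upload("standard/", queue_transport_model)

# A data provider uploads one raw measurement (made-up values).
store.put(
    "raw/site_a/2020-01-01.json",
    {"site": "site_a", "gas": "CH4", "value": "1950.2"},
)
print(model_queue)
```

Registering handlers by key prefix keeps each stage independent: a new processing step can be added by attaching another handler, without changing the upload code.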

Our team is world leading in the measurement and analysis of GHGs, cloud computing and spatial mapping. This project will rely heavily on a cloud platform (built as part of the EPSRC-funded BioSimSpace project) and GHG analysis codebase that has already been developed by team members. These tools are built on top of standard tools such as Jupyter notebooks, distributed object stores, and serverless functions. It is this expertise and these open tools that will allow us to develop the framework for our data science hub that will be extensible by GHG researchers at the end of this project.

We envisage that such a hub could be at the centre of the UK's large and growing GHG science community, allowing scientists to upload, analyse and visualise their data on a single platform, enhancing data integration and sharing between groups. Ultimately, this platform could be extended to allow the public to interact with GHG data, letting them learn whether the UK's emissions reductions efforts are reflected in atmospheric observations.

Planned Impact

Our impact will target the following groups, with whom we already enjoy strong links:

1. UK inventory teams: Our work will directly benefit the government Department for Business, Energy and Industrial Strategy (BEIS) and Defra, who are responsible for delivering the GHG inventory under the UNFCCC and Kyoto agreements. We will build on our existing collaborations, for example with Ricardo Energy and Environment (contractors with overall responsibility for the national inventory), to ensure that new inventory developments are incorporated into our project. The impact will be improvements in monitoring progress towards climate goals, and ultimately better-informed decisions on how to reach those goals.
2. Next generation of greenhouse gas scientists.
3. The wider academic and industrial community who are evaluating use of the cloud.
4. Cloud providers who are working to adapt their offerings for the UK research community.
5. The public: The general public are increasingly engaged in climate issues and wish to better understand their country's impact on climate.

We will engage with these users through the following methods/activities:

1. We will present our hub at the UK National Inventory Steering Committee (NISC) annual meetings, which our team already attend. By working closely with inventory compilers, we will ensure that the UK inventory is closely integrated into our platform. By operationalising the process of comparing the inventory to atmospheric observations, we will provide more responsive, near-real-time updates to policy makers on any discrepancies with the inventory.
2. Post-graduate students studying GHG science experience a steep learning curve, and few are based in an environment where data and models are routinely combined to understand GHG data. Our hub will provide a platform where GHG data can be visualised and compared to models, mapping information and meteorology, to allow students to better understand the content of their data. Furthermore, by providing a central location for GHG data and models, we envisage improved collaboration between students working in this field across the country.
3. We will develop case studies from our use of the cloud which will include both the benefits and drawbacks of this delivery model of data storage and computing. This will include a comparison between running simulations using on-premise hardware and running the same calculation on the cloud. These case studies will be shared with the wider research, HPC and funder communities via our existing links to e.g. the EPSRC eInfrastructure SAT, RSE community, HPC-SIG etc.
4. We will work closely with cloud providers to help them map their service offerings to the needs of the UK research community. In particular, in this project we have partnered with Oracle to help develop and optimise the Fn service. This will include regular (fortnightly) meetings with Cloud engineers and Fn software developers, early access to the production Fn service and collaborative working to ensure that the service is optimised and used effectively to deploy HUGS. We will share this information with other cloud providers via our existing close links, and with the wider cloud community via a presentation at KubeCon 2019. The aim will be to demonstrate the power and flexibility of open source serverless frameworks, thereby encouraging the community to move away from proprietary platforms (e.g. AWS Lambda or Azure Functions).
5. Ultimately, we aim to make an interactive web interface to our hub that will be accessible by the public. Such a platform will increase public confidence in measurements of the UK's GHG concentrations. Our team also has a track record of press engagement (e.g. recently featuring in BBC Radio 4's "Inside Science"), which we will continue throughout this project. Our impact here will be to make the issues understood by a wider audience, allowing them to be engaged in the national debate.
 
Description We have developed a cloud-based platform for greenhouse gas data analysis, and an open source codebase.

Based on this platform, we have developed a greenhouse gas data viewer, which was showcased during COP26 in Glasgow: https://openghg.github.io/dashboard/#/
Exploitation Route The HUGS service will be online throughout 2020. The code will be publicly available for others to develop, if needed.
Sectors Environment

URL https://openghg.org
 
Description We developed a platform for showcasing GHG data, including explanatory information on how this data can be used to evaluate national emissions. The platform was rolled out at COP26 in Glasgow: https://openghg.github.io/dashboard/#/. An article was written describing the data collection and use in The Conversation: https://theconversation.com/countries-may-be-under-reporting-their-greenhouse-gas-emissions-thats-why-accurate-monitoring-is-crucial-171645
First Year Of Impact 2021
Sector Environment
Impact Types Policy & public services

 
Description UK Emission Measurement System
Amount £1,372,621 (GBP)
Funding ID NE/Y001761/1 
Organisation Natural Environment Research Council 
Sector Public
Country United Kingdom
Start 03/2023 
End 03/2025
 
Title hugs-cloud 
Description www.hugs-cloud.com is a cloud-based portal for greenhouse gas data analysis. 
Type Of Material Improvements to research infrastructure 
Year Produced 2020 
Provided To Others? No  
Impact N/A 
URL http://www.hugs-cloud.com
 
Description Advanced Global Atmospheric Gases Experiment (AGAGE) 
Organisation Commonwealth Scientific and Industrial Research Organisation
Country Australia 
Sector Public 
PI Contribution International programme to measure and model atmospheric trace gases
Collaborator Contribution Data provision. Model development.
Impact Several publications (e.g. Rigby et al., 2013; 2014).
Start Year 2008
 
Description Advanced Global Atmospheric Gases Experiment (AGAGE) 
Organisation Empa - Swiss Federal Laboratories for Materials Science and Technology
Country Switzerland 
Sector Academic/University 
PI Contribution International programme to measure and model atmospheric trace gases
Collaborator Contribution Data provision. Model development.
Impact Several publications (e.g. Rigby et al., 2013; 2014).
Start Year 2008
 
Description Advanced Global Atmospheric Gases Experiment (AGAGE) 
Organisation Massachusetts Institute of Technology
Country United States 
Sector Academic/University 
PI Contribution International programme to measure and model atmospheric trace gases
Collaborator Contribution Data provision. Model development.
Impact Several publications (e.g. Rigby et al., 2013; 2014).
Start Year 2008
 
Description Met Office 
Organisation Meteorological Office UK
Country United Kingdom 
Sector Academic/University 
PI Contribution Expertise in inverse methods
Collaborator Contribution Expertise in atmospheric modelling
Impact Publications.
Start Year 2012
 
Description National Oceanic and Atmospheric Administration (NOAA) Earth System Research Laboratory (ESRL) 
Organisation National Oceanic And Atmospheric Administration
Department Earth System Research Laboratory (ESRL)
Country United States 
Sector Public 
PI Contribution Modelling of greenhouse gases.
Collaborator Contribution Provision of greenhouse gas data and expertise
Impact Several publications have resulted from this collaboration.
Start Year 2012
 
Title ACRG-Bristol/acrg: ACRG v0.2.0 
Description ACRG standardisation and inversion code v0.2.0

Added
- Ability to convert calibration scale in get_obs
- New "defaults" file that specifies inlets and instruments to use for particular time periods
- An obs.db SQLite database that specifies the location of all obs files and basic details about their contents (species, inlet, time range, etc.)
- notebooks directory for Jupyter notebooks
- notebooks/tutorials directory for notebook-based tutorials
- A tmp directory to store random job script output files
- A dev environment that includes spyder and a lighter environment that does not

Changed
- get_single_site now returns a list of xarray datasets, one for each combination of inlet and site. If defaults are specified, the list will contain the default instruments and inlets for each period
- get_obs now returns a dictionary containing lists of datasets
- Calibration scale and inlet are now attributes of obs datasets (e.g. ds.attrs["scale"])
- fp_data_merge now works with the new get_obs object
- The flux function will now look for species-total.nc named files first and then look for species.nc files. It will not read both sets of files. It can still accept a more explicit source such as co2-ff_*.nc as an alternative.
- arviz package version pinned to prevent conflict with pymc3 version
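
The flux file lookup order described in these release notes could be sketched as follows. This is an illustrative sketch only, not the ACRG implementation: the function name `find_flux_files`, its parameters, and the glob patterns `<species>-total_*.nc` and `<species>_*.nc` are assumptions, with the patterns inferred from the `co2-ff_*.nc` example in the notes.

```python
import tempfile
from pathlib import Path


def find_flux_files(flux_dir, species, source=None):
    """Return flux files for a species using a hypothetical lookup order."""
    flux_dir = Path(flux_dir)
    if source is not None:
        # An explicit source (e.g. "co2-ff") bypasses the fallback logic.
        return sorted(flux_dir.glob(f"{source}_*.nc"))
    # Prefer "<species>-total" files; fall back to plain "<species>" files
    # only when no total files exist, so the two sets are never mixed.
    total_files = sorted(flux_dir.glob(f"{species}-total_*.nc"))
    if total_files:
        return total_files
    return sorted(flux_dir.glob(f"{species}_*.nc"))


if __name__ == "__main__":
    # Demonstrate the lookup on empty placeholder files (made-up names).
    with tempfile.TemporaryDirectory() as tmp:
        for name in ["co2-total_DOMAIN_2018.nc", "co2_DOMAIN_2018.nc"]:
            (Path(tmp) / name).touch()
        print([p.name for p in find_flux_files(tmp, "co2")])
```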
Type Of Technology Software 
Year Produced 2022 
Open Source License? Yes  
Impact This software is used to evaluate emissions of greenhouse gases for the UK and other countries. 
URL https://zenodo.org/record/6834888
 
Title HUGS greenhouse gas data analysis platform 
Description The HUGS platform provides researchers with tools to analyse greenhouse gas observations and related data (e.g. emissions inventories, atmospheric models). 
Type Of Technology Webtool/Application 
Year Produced 2020 
Open Source License? Yes  
Impact Too early to say 
URL http://www.hugs-cloud.com