Methodologically Enhanced Virtual Labs for Early Warning of Significant or Catastrophic Change in Ecosystems: Changepoints for a Changing Planet

Lead Research Organisation: Lancaster University
Department Name: Computing & Communications

Abstract

Virtual labs are emerging as a key component in the construction of future digital environments, particularly to abstract over the complexities of the underlying distributed networks of sensors and associated computational infrastructure. We define a virtual lab as a transdisciplinary collaboration space hosted in the cloud (public/private/hybrid) that allows stakeholders to access a range of data, analytical methods and assessment tools (e.g. visualisation tools and/or statistical tools), and to execute these analyses using the elastic capacity of a cloud. In the environmental science community, most existing virtual labs focus on the problem of integrating often complex and heterogeneous data. We seek to significantly advance the state-of-the-art by enhancing virtual labs with sophisticated methodological capability, embracing state-of-the-art data science techniques to assist in the societally-relevant interpretation of these data.

This is a bold and broad vision and, to make this feasible in a year, we elect to work with a particular family of data science techniques, that is, changepoint detection methods, designed to identify fundamental changes and anomalous behaviour in data, typically within time-series, but also applicable across space and time and to complex, multivariate problems.
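As a minimal illustration of the kind of technique meant here (a sketch under stated assumptions, not the algorithms developed in this project), the Python function below locates a single change in the mean of a time series. It tries every admissible split and keeps the one that minimises the summed squared error of the two resulting segments. The function name `single_changepoint`, the `min_size` parameter and the example values are hypothetical and chosen purely for illustration.

```python
from statistics import fmean


def single_changepoint(series, min_size=2):
    """Find one change in mean by least-squares segmentation.

    Returns (k, gain): k is the index where the second segment starts,
    gain is the reduction in squared error compared with no split.
    Illustrative sketch only; names and defaults are assumptions.
    """
    n = len(series)
    if n < 2 * min_size:
        raise ValueError("series too short for the requested min_size")

    def sse(segment):
        # Sum of squared deviations from the segment mean.
        m = fmean(segment)
        return sum((x - m) ** 2 for x in segment)

    best_k, best_cost = None, float("inf")
    # Each candidate split leaves at least min_size points on both sides.
    for k in range(min_size, n - min_size + 1):
        cost = sse(series[:k]) + sse(series[k:])
        if cost < best_cost:
            best_k, best_cost = k, cost

    return best_k, sse(series) - best_cost


# Made-up series whose mean shifts from about 1 to about 5 at index 5.
values = [1.0, 1.2, 0.9, 1.1, 1.0, 5.0, 5.2, 4.9, 5.1, 5.0]
k, gain = single_changepoint(values)
print(k)  # 5
```

In practice the gain would be compared against a penalty or threshold before declaring a change, and multiple-changepoint or multivariate settings call for more elaborate search methods than this exhaustive single split.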

This feasibility study will therefore bring together a cross-disciplinary team working on virtual labs, changepoint methods and evidence for impacts of global environmental change on ecosystem structure and function. Our approach will foster a deep, cross-disciplinary dialogue through workshops, enhanced by rapid prototyping of virtual labs to stimulate thinking about what is possible/desirable with respect to ecosystem early warning methods.

The project will build on the rich, complex, multi-faceted data available from the Environmental Change Network (ECN), which offers detailed multivariate 25-year-long data sets for a range of ecosystems in the UK. We seek to understand the role of data science, including, but not limited to, changepoint detection, in the construction of environmental early warning alert systems capable of operating at a variety of scales, from catchments to global, planetary-level systems.

Planned Impact

The proposed research is well balanced between cutting edge research and impact on stakeholders and the greater environmental community. Impact is very important to us, and in this document we present a multi-faceted Pathways to Impact strategy intended to deliver the following impact goals:

1. To influence future generations of virtual labs nationally and internationally in terms of embracing data science methodologies;
2. To place virtual labs and data science methodologies at the heart of future digital environment research and practice;
3. To demonstrate and evaluate the role of changepoints (in isolation or in combination with other data science techniques) in supporting environmental decision makers in offering early warning indicators of significant or catastrophic change, and hence enabling early interventions in terms of mitigation or adaptation strategies;
4. To build momentum and an associated community interested in constructing environmental early warning systems operating at different scales;
5. To contribute our considerable expertise and additional insights from this project into enhancing multidisciplinary and inter-disciplinary research and innovation (MIDRI).

We plan a series of mechanisms to ensure that we engage with our partners and beneficiaries throughout the project, with this engagement being fundamentally woven into the research methodology.

Firstly, we plan two workshops to be held in month 2 and month 12 respectively. The first workshop is internal, involving the investigators and associated research staff from Lancaster University and CEH, together with representatives of our partners. The second workshop is external and will involve all the constituents from the first workshop together with invitees from our identified beneficiaries including the Digital Environment community.

An agile methodology will be adopted in the project, with virtual lab development being broken into a series of 2-month sprints and show-and-tell sessions organised at the end of each sprint. This approach is known to be highly effective in maximising partner engagement and ensuring that the final solutions are carefully tailored to the needs of beneficiaries.

Our Centre of Excellence in Environmental Data Science (CEEDS) is an important vehicle to disseminate the results both externally, e.g. through our planned annual conference, and internally as we work closely with a range of environmental scientists in CEH and the Lancaster Environment Centre.

We believe that having a strong Digital Environment community is important and look forward to working with the champions and greater community through community meetings/events; we also offer to host a community meeting. There is also an important people dimension, as we train researchers to have the cross-disciplinary skills to contribute to Digital Environments.

Finally, we will adopt an open science policy within this project, with all data, software and papers made available as open assets. The project will also maintain a strong website presence coupled with a social media strategy.
 
Description The results fall into three categories:
Methodological: The development of novel algorithms/methods to determine potential changepoints when looking over a number of variables and associated time series. The techniques are also tolerant of misalignment over time.
Computational: We have extended our concept of virtual labs to support this and other methods (methodologically enhanced virtual labs) as a key step in supporting collaborative environmental data science.
Scientific: Our cross-disciplinary research is offering new scientific insights related to change across different environmental facets.
Exploitation Route We are feeding results into the UKRI SPF on Constructing a Digital Environment, including through the Expert Network associated with this programme (two of the investigators are involved in this network).

Our insights into methodologically enhanced virtual labs are influencing the Data Labs project (open-source software supporting virtual labs, running on the JASMIN HPC facility), offering an enhanced service for the environmental sciences community.
Sectors Digital/Communication/Information Technologies (including Software), Environment

 
Description The insights into the use of virtual labs are feeding into the UKRI Constructing a Digital Environment programme, e.g. through the associated expert network. The insights into methodologically enhanced virtual labs are also being fed into NERC thinking around the future of JASMIN and data centres, e.g. through the associated Data Labs initiative. These insights are also influencing architectures for digital twins, especially digital twins that support understanding and management of the natural environment.
First Year Of Impact 2021
Sector Digital/Communication/Information Technologies (including Software), Environment
Impact Types Societal, Policy & public services

 
Title Collaborative workshop to investigate the use of changepoints methods on ECN data 
Description We have produced a notebook and associated data sets as the outcome of an internal project workshop that brought together a cross-disciplinary team to collaborate using virtual labs as a medium.
Type Of Material Data analysis technique 
Year Produced 2020 
Provided To Others? Yes  
Impact Proof of concept of the effectiveness of virtual labs technology (specifically the DataLabs implementation) to support collaborative and cross-disciplinary research.
URL https://github.com/NERC-CEH/cptecn_DataLabs_workshop
 
Description DataLabs - Tales from the frontline of data science 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Seminar presented on "DataLabs - Tales from the frontline of data science" by Michael Hollaway, Tom August and Michael Tso as part of the NERC-supported webinar series on Constructing a Digital Environment (https://digitalenvironment.org/cde-webinar-series/).
Year(s) Of Engagement Activity 2022
URL https://digitalenvironment.org/cde-webinar-series/#HollawayAugustTso
 
Description Virtual labs and digital environments: can virtual lab technology support a paradigm shift towards a more open, collaborative and integrative environmental science? 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact Seminar presented on "Virtual labs and digital environments: can virtual lab technology support a paradigm shift towards a more open, collaborative and integrative environmental science?" by Prof. Gordon Blair and Dr. Michael Hollaway as part of the NERC-supported webinar series on Constructing a Digital Environment (https://digitalenvironment.org/cde-webinar-series/).
Year(s) Of Engagement Activity 2021
URL https://digitalenvironment.org/cde-webinar-series/#GordonBlair