Engineering Transformation for the Integration of Sensor Networks: A Feasibility Study - 'ENTRAIN'

Lead Research Organisation: UK Ctr for Ecology & Hydrology fr 011219
Department Name: Water Resources (Wallingford)

Abstract

There is a need to make use of new digital data analysis techniques to improve our understanding of the environment. Data from a new generation of environmental sensors, combined with analyses based on Artificial Intelligence, have the potential to help us understand how human influences and long-term change are affecting the environment around us. Artificial Intelligence approaches enable computers to identify trends and relationships across different streams of data, often picking out patterns that would be too difficult or time-consuming for humans to identify manually.

To realise these benefits, data from diverse sensor networks must be combined and analysed together. Currently many sensor networks are operated individually, and data are not readily combined due to differences in the way measurements are made (e.g. between weekly river samples and sub-second measurements of gases in the atmosphere). In addition, combining these data automatically, without human intervention, requires much finer and more consistent descriptions of the contents of data streams, so that machines can understand the content sufficiently. Links between sensors in space are also important, and machines will need an understanding of these links, not just in the sense of coordinates, but also, for example, of how sensors are linked along rivers. We can construct a digital representation of rivers in order to enable this.
We will describe the various elements of a future environmental analysis system that will be required in order to achieve these benefits, and address some of the components that are currently missing. We will look at technologies, from databases to data transfer mechanisms, to understand how a system could be built.
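
As an illustration of what a machine-readable representation of river links might look like, the following is a minimal sketch only. It assumes a river can be simplified to a set of reaches, each flowing into one downstream reach; the reach names, sensor identifiers and function names are all hypothetical and do not come from the project.

```python
from collections import deque

# Hypothetical river topology: each reach maps to the reach it flows into.
DOWNSTREAM = {
    "headwater_a": "confluence",
    "headwater_b": "confluence",
    "confluence": "lower_river",
    "lower_river": "estuary",
}

# Hypothetical sensor register: sensor id -> reach the sensor sits on.
SENSOR_REACH = {
    "wq_01": "headwater_a",
    "wq_02": "headwater_b",
    "wq_03": "lower_river",
}


def upstream_reaches(reach, downstream=DOWNSTREAM):
    """Return every reach that drains into `reach` (excluding itself)."""
    # Invert the downstream map to obtain upstream neighbours.
    upstream = {}
    for src, dst in downstream.items():
        upstream.setdefault(dst, []).append(src)
    found, queue = set(), deque([reach])
    while queue:
        current = queue.popleft()
        for parent in upstream.get(current, []):
            if parent not in found:
                found.add(parent)
                queue.append(parent)
    return found


def upstream_sensors(sensor_id):
    """Return sorted ids of sensors on reaches upstream of the given sensor."""
    reaches = upstream_reaches(SENSOR_REACH[sensor_id])
    return sorted(s for s, r in SENSOR_REACH.items() if r in reaches)
```

With a structure of this kind, a query such as `upstream_sensors("wq_03")` identifies the sensors whose measurements could influence a downstream site, which coordinates alone cannot express.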

We will use data from three NERC sensor networks, measuring environmental variables ranging from the atmosphere to river water quality, and show how these data can be integrated automatically so that machines can analyse them without human intervention.
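
One simple way to bring streams of very different resolution onto a common footing is to aggregate the faster stream to the time step of the slower one. The sketch below assumes weekly means are an acceptable summary and that weeks start on Monday; the function names and the choice of aggregation are illustrative assumptions, not the project's method.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from statistics import mean


def week_start(ts):
    """Return midnight on the Monday of the week containing `ts`."""
    monday = ts - timedelta(days=ts.weekday())
    return monday.replace(hour=0, minute=0, second=0, microsecond=0)


def weekly_means(readings):
    """Average (timestamp, value) readings into one value per week."""
    buckets = defaultdict(list)
    for ts, value in readings:
        buckets[week_start(ts)].append(value)
    return {week: mean(values) for week, values in buckets.items()}


def align(weekly_samples, fast_readings):
    """Pair each weekly sample with the mean of the fast readings in its week."""
    fast = weekly_means(fast_readings)
    rows = []
    for ts, value in weekly_samples:
        week = week_start(ts)
        # The third element is None when no fast readings fall in that week.
        rows.append((week, value, fast.get(week)))
    return rows
```

Each row returned by `align` holds the week, the weekly sample and the matching high-frequency mean, so a gap in either stream stays visible rather than being silently dropped.
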
A significant issue when monitoring with high-resolution sensors is how to handle problems in the data, which could include missing data, and erroneous values due to sensor failure. There is too much data for humans to manually view and check, and so automated approaches are needed. Currently these are often simple checks of individual data values against expected ranges, but again there are opportunities for artificial intelligence to improve this. AI approaches can look across multiple sensors, identify relationships, and find subtle changes in data signals, and this can be used to both identify data problems and to fix them through infilling. We will enhance the 3 NERC networks by testing and applying such approaches to data quality control.
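
The contrast between a per-value range check and a cross-sensor approach can be sketched as follows. This is a deliberately simple stand-in: it uses an ordinary least-squares line fitted against one neighbouring sensor rather than any AI method, and the function names, thresholds and the choice of a single neighbour are assumptions made for illustration.

```python
def range_flags(values, low, high):
    """Flag each value: True if missing (None) or outside [low, high]."""
    return [v is None or not (low <= v <= high) for v in values]


def fit_line(xs, ys):
    """Ordinary least-squares slope and intercept for paired values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def infill(target, neighbour, low, high):
    """Replace flagged target values with estimates from a neighbouring sensor."""
    flags = range_flags(target, low, high)
    # Fit only on time steps where both sensors have usable values.
    good = [(n, t) for n, t, bad in zip(neighbour, target, flags)
            if not bad and n is not None]
    slope, intercept = fit_line([n for n, _ in good], [t for _, t in good])
    filled = []
    for t, n, bad in zip(target, neighbour, flags):
        if not bad:
            filled.append(t)
        elif n is not None:
            filled.append(slope * n + intercept)
        else:
            filled.append(None)  # nothing to estimate from
    return filled, flags
```

The sketch needs at least two good paired values with differing neighbour readings, otherwise the fit is undefined. Returning the flags alongside the filled series keeps a record of which values are estimates rather than measurements.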

We will investigate some fundamental limitations of high-resolution monitoring: the transfer of large amounts of data from the field site to the data centre, the security of such systems, and whether more processing could be done on the instruments themselves to reduce data transfer volumes.
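
As a rough illustration of how on-instrument processing could reduce transfer volumes, the sketch below collapses raw readings into fixed-size window summaries before transmission. The window size, the choice of summary statistics and the function name are hypothetical; which statistics are worth keeping is exactly the kind of question the study would need to examine.

```python
from statistics import mean


def summarise(readings, window):
    """Collapse raw readings into per-window (count, min, mean, max) summaries."""
    summaries = []
    for start in range(0, len(readings), window):
        chunk = readings[start:start + window]
        summaries.append((len(chunk), min(chunk), mean(chunk), max(chunk)))
    return summaries
```

Sending one four-value summary per window instead of every raw reading trades detail for bandwidth, so short-lived events inside a window can be lost unless they show up in the minimum or maximum.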

We will meet with the public, with policy-makers, with industry and with researchers to discuss where the most is to be gained from developing AI approaches to analysing environmental sensor data. We will develop ideas for future work to realise these gains, and will promote the benefits of an integrated system for environmental monitoring. These stakeholders are likely to include the Environment Agency, SEPA, Natural Resources Wales, Defra, water companies, sensor network developers, and public organisations with an interest in the environment, including the National Trust, the Rivers Trusts, and local community groups.

Planned Impact

The Digital Environment programme will benefit from ENTRAIN's foundation work, which will provide requirements, methods, best practice advice and recommendations for integration and data modelling across multiple sensor networks and other datasets, such as Earth Observation (EO). This information and these techniques will benefit other areas of science and industry, as well as public engagement.

Environmental practitioners, regulators, government, consultants, the water industry, agribusiness, insurance, and many others can benefit substantially from the joined-up evidence that ENTRAIN will start to generate.

The public, schools and colleges will benefit from access to meaningful data in a spatially aware context. There is strong public interest in environmental issues, and yet beyond weather forecasts, weather data and perhaps more recently air quality readings, there are few accessible data which have tangible meaning to the layperson.

The Earth Observation (EO) community will benefit from better access to connected in situ data for retrieval algorithm validation and development (e.g. for flooding extent, soil moisture and land cover products), and from new automated Phenocam greenness products output by ENTRAIN. Other Big Data projects (e.g. Data Labs) will benefit from a greater and easier connection to datasets, with common spatio-temporal linking requirements and proper metadata description already done.

Data modellers and environmental informatics practitioners will benefit from improved sensor metadata schemes, sensor registers/catalogues, and data structuring for interoperability of a network of networks. We will also disseminate the results of the ENTRAIN feasibility study through an environmental informatics paper, describing the proposed methods and advances made in data modelling and data provenance, to ensure wide impact and uptake of these methods.

Observational scientists and regulatory observers will benefit from our website case studies, webinars, training workshops and online video tutorials, which will disseminate best practice and training in real-time data collection, cyber security, data vocabularies, metadata schemes and new deep learning QC techniques. These studies will benefit the global environmental data community and wider data communities, such as the Committee on Data for Science and Technology (CODATA). Harmonisation of data networks to increase or facilitate interoperability, and production of spatio-temporally connected observations across environmental domains (e.g. Digital Rivers), will benefit the British Geological Survey (BGS), CEH, the UK Met Office, the Environment Agency (EA), the Scottish Environment Protection Agency (SEPA), Defra, water companies (and other utilities such as the power grid), agribusiness, insurers, and public health. Other NERC and EPSRC funded programmes (e.g. ASSIST, Natural Flood Management, HydroJULES, Internet of Food Things) and Defra Air Quality Monitoring Networks will all benefit through higher data quality, more complete data and efficiency gains in analysing data across sensor networks, and by using the developed data structures in other environmental monitoring and food chain domains. The spatially connected integrated data visualisations will be demonstrated at both academic and industry events and conferences, where practitioners can benefit from, for example, improved water quality alerts (e.g. algal blooms, nutrient levels), see the connected drivers of those trends, and get decision support information. Similarly, this has the potential to provide interconnected, cross-discipline environmental management information that can give government the evidence chain to implement, monitor and evaluate policy.

We will monitor and evaluate the success of our impacts by recording attendances at events and the number of website visits, and by using Twitter to promote activities whilst recording the number of followers and retweets. We will also log email enquiries and views of webinars and web videos.

Related Projects

Project Reference  Relationship  Related To    Start       End         Award Value
NE/S016244/1                                   01/04/2019  30/11/2019  £251,614
NE/S016244/2       Transfer      NE/S016244/1  01/12/2019  30/06/2020  £62,903