Biodiversity indicators from nonprobability samples: Interdisciplinary learning for science and society

Lead Research Organisation: UK CENTRE FOR ECOLOGY & HYDROLOGY
Department Name: Biodiversity (Wallingford)

Abstract

Understanding the global biodiversity crisis requires regular monitoring and reporting. Scientists use a combination of biodiversity data and statistical methods for this purpose. Biodiversity data, however, are not often representative samples of reality. Other research areas have been dealing with similar issues for many years, such as when political scientists try to predict election outcomes from unrepresentative public polling. Accounting for such evidence quality issues is an essential part of the maturation of the use of "big data" in ecology, particularly as research outputs are increasingly being called upon to evaluate both international targets (e.g. those linked to the Convention on Biological Diversity) and national government policies. For example, the forthcoming UK Environment Act is planning to use ecological indicators to both set, and evaluate progress towards, targets relating to the state of the environment. Whilst such indicators have long been used as "official statistics" to inform government, this direct link to legislation is new. Given all the subsequent decisions that this usage might entail (e.g. funding for conservation), accurate appraisals of our environment, including adjustments for unrepresentative sampling, are clearly essential.
At the same time, the growth of digital communication and IT has created opportunities to visualise and disseminate patterns in data like never before. Even within the recent past the COVID pandemic has increased the rate at which the public are presented with charts and data. Parallel to this, there has been a steady growth in public interest in the environment, with organisations such as the Intergovernmental Science-Policy Platform on Biodiversity and Ecosystem Services (IPBES) and environmental charities now keen to summarise and present the "state of nature" to the public to bolster their understanding of ecological issues. Trends in quantities that are considered to indicate the health of some part of our environment are a significant part of this, and are regularly published, promoted, and extensively shared. Such trends are often used as "ecological indicators", i.e. numbers that directly indicate some change in our environment that we wish to manage or simply understand, an area with a long history of research in ecology. Communicating uncertainty around such metrics is a fundamental part of keeping the public informed about the true state of scientists' knowledge about biodiversity change.
What is not often considered, however, is the quality of the evidence used to create such statistics. In the UK, most biodiversity indicators are based on amateur naturalist activity, which, whilst frequently of very high quality, is not often the result of random sampling. Globally, data are highly heterogeneous, and even professional monitoring data become unrepresentative at this scale (i.e. there is no overall random sample of earth's biodiversity). However, the robust estimation of time trends in species' distributions or abundances requires representative data. This is ultimately a statistical problem, common to all sciences that wish to understand reality from samples. Random samples are at the heart of strong statistical inference, and so departures from this condition should give us pause for thought. Luckily, statisticians have put much effort into considering how nonrandom samples can be made more reliable, and a rich collection of advice and technical methods from other research areas is available to this end. Our project will investigate this set of techniques to highlight ways in which the ecological evidence base underpinning our knowledge of the current biodiversity crisis can be improved, and how this uncertainty can be accurately and clearly communicated to policymakers and the public.
 
Description Biodiversity indicators used to establish and inform government targets are often based on spatially biased samples, and these biases may even change through time. We have highlighted the pervasive nature of this problem and demonstrated techniques from other disciplines, including opinion polling, that can be used to address these issues, leading to more accurate representations of the state of our knowledge of the environment.
Exploitation Route Use of methods to adjust existing biodiversity indicators (e.g. see https://jncc.gov.uk/our-work/ukbi-c7-plants-of-the-wider-countryside)
Sectors Environment

 
Description Biodiversity indicators are used to monitor the impacts of government policy and environmental change, and to inform the public regarding the state of the environment. Our work has highlighted the pervasive issues concerning spatio-temporal biases in such indicators, and has identified a number of methods from other disciplines that could be used to address these issues. We have also demonstrated to governmental bodies (JNCC, Defra) how such adjustments can change conclusions regarding species' trends, and are currently working on implementing these for at least one headline indicator (https://jncc.gov.uk/our-work/ukbi-c7-plants-of-the-wider-countryside).
First Year Of Impact 2023
Sector Environment
Impact Types Societal

 
Description Adjusting government biodiversity indicators for bias
Geographic Reach National 
Policy Influence Type Contribution to new or improved professional practice
URL https://jncc.gov.uk/our-work/ukbi-c7-plants-of-the-wider-countryside
 
Title Bias adjustments for biodiversity monitoring 
Description Biodiversity monitoring usually involves drawing inferences about some variable of interest across a defined landscape from observations made at a sample of locations within that landscape. If the variable of interest differs between sampled and nonsampled locations, and no mitigating action is taken, then the sample is unrepresentative and inferences drawn from it will be biased. It is possible to adjust unrepresentative samples so that they more closely resemble the wider landscape in terms of "auxiliary variables." A good auxiliary variable is a common cause of sample inclusion and the variable of interest, and if it explains an appreciable portion of the variance in both, then inferences drawn from the adjusted sample will be closer to the truth. We applied six types of survey sample adjustment (subsampling, quasirandomization, poststratification, superpopulation modeling, a "doubly robust" procedure, and multilevel regression and poststratification) to a simple two-part biodiversity monitoring problem. The first part was to estimate the mean occupancy of the plant Calluna vulgaris in Great Britain in two time periods (1987-1999 and 2010-2019); the second was to estimate the difference between the two (i.e., the trend). We estimated the means and trend using large, but (originally) unrepresentative, samples from a citizen science dataset. Compared with the unadjusted estimates, the means and trends estimated using most adjustment methods were more accurate, although standard uncertainty intervals generally did not cover the true values. Completely unbiased inference is not possible from an unrepresentative sample without knowing and having data on all relevant auxiliary variables. Adjustments can reduce the bias if auxiliary variables are available and selected carefully, but the potential for residual bias should be acknowledged and reported.
Type Of Material Data analysis technique 
Year Produced 2023 
Provided To Others? Yes  
Impact Improved practice for UK Biodiversity Indicators is underway (e.g. see https://jncc.gov.uk/our-work/ukbi-c7-plants-of-the-wider-countryside) 
URL https://doi.org/10.5281/zenodo.10029669
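
Illustrative sketch: Of the adjustment methods named in the description above, poststratification is perhaps the simplest to show in a few lines. The following is a minimal Python sketch, not the project's own implementation (which is available at the Zenodo URL above). The stratum names ("upland", "lowland"), the landscape shares, and the occupancy records are all hypothetical values chosen only to make the arithmetic easy to follow. The idea is to compute the mean of the variable of interest within each stratum of an auxiliary variable, then weight those stratum means by each stratum's known share of the whole landscape rather than by its share of the sample.

```python
from collections import defaultdict


def naive_mean(sample):
    """Unadjusted mean: every sampled site counts equally."""
    return sum(value for _, value in sample) / len(sample)


def poststratified_mean(sample, population_shares):
    """Weight each stratum's sample mean by its share of the landscape.

    sample            -- list of (stratum, value) pairs, value = 1 if occupied
    population_shares -- dict mapping stratum -> proportion of the landscape
                         (assumed known, e.g. from a land cover map)
    """
    totals = defaultdict(float)
    counts = defaultdict(int)
    for stratum, value in sample:
        totals[stratum] += value
        counts[stratum] += 1

    # Poststratification cannot say anything about strata with no samples.
    missing = set(population_shares) - set(counts)
    if missing:
        raise ValueError(f"no sampled sites in strata: {sorted(missing)}")

    return sum(
        share * totals[stratum] / counts[stratum]
        for stratum, share in population_shares.items()
    )


# Hypothetical records: recorders visited uplands far more often than
# lowlands, and the species is mostly found in uplands.
sample = (
    [("upland", 1)] * 6 + [("upland", 0)] * 2 + [("lowland", 0)] * 2
)

# Hypothetical landscape composition: uplands are only a quarter of the area.
population_shares = {"upland": 0.25, "lowland": 0.75}

naive = naive_mean(sample)
adjusted = poststratified_mean(sample, population_shares)

print(f"naive occupancy estimate:    {naive:.4f}")
print(f"poststratified estimate:     {adjusted:.4f}")
```

In this toy case the unadjusted estimate is inflated because the over-sampled stratum is also the one where the species is most common, which is the "common cause" situation the description refers to. The adjustment only helps to the extent that the chosen strata capture the reasons sites were sampled; any remaining within-stratum selection would leave residual bias.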