Predictive text mining for global biodiversity indicators and models.

Lead Research Organisation: Imperial College London
Department Name: Life Sciences

Abstract

Biodiversity indicators and models are important for policy decisions, scientific understanding and public engagement. Although advances in remote sensing are revolutionising the production of indicators of ecosystem structure and function, biodiversity indicators still rely on bottom-up aggregation and synthesis of data from many local surveys. This is currently very time-consuming because finding these data in the burgeoning literature and extracting them is still a slow, manual process.

The PREDICTS (based at the NHM) and LPI (based at the ZSL) databases are global in coverage and their associated biodiversity indicators are internationally recognised. However, both datasets have key spatial and thematic gaps. These gaps undermine attempts to produce indicators that are representative of biodiversity, rather than reflecting the data's geographic, taxonomic or ecological biases.

This project will use new developments in text mining and machine learning to greatly increase the rate of data flow into the above datasets, with the potential to preferentially target data on taxa and geographic regions that are currently under-represented. Recalculating the associated indicators using the enhanced databases will then provide a better overview of the state of the natural world and facilitate a comparison with the values obtained from the current, manually collated data. Given the broad ecological coverage of the two databases, further avenues of research will also be possible.

Publications


Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
NE/R012229/1 | | | 01/10/2017 | 01/05/2024 |
2004077 | Studentship | NE/R012229/1 | 15/01/2018 | 15/10/2021 | Richard Cornford
NE/W503198/1 | | | 01/04/2021 | 31/03/2022 |
2004077 | Studentship | NE/W503198/1 | 15/01/2018 | 15/10/2021 | Richard Cornford
 
Description Poster presentation - British Ecological Society Annual Meeting 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Postgraduate students
Results and Impact I presented a poster concerning my work on the spatial projection of vertebrate population trends.

I was able to discuss the potential issues of using predictive models without validation with a number of interested individuals and thus raise their awareness of the topic.
Year(s) Of Engagement Activity 2019
 
Description Presentation at a Cross-Government Data Science Community Meet-up 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Professional Practitioners
Results and Impact I was invited to talk about my work at a community of interest event focused on the use of data science across the government.

My presentation covered the importance of biodiversity indicators and databases and the challenges associated with collating the required data, before moving on to my work, which seeks to address this issue. Specifically, I spoke about my use of text mining and machine learning to increase the rate at which literature relevant to ecological datasets can be discovered.

I was able to discuss my work with various data science practitioners present at the event and my talk will be mentioned in an upcoming, internal government blog.
Year(s) Of Engagement Activity 2020
 
Description Presented a poster at the Natural History Museum's Student Conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Postgraduate students
Results and Impact I presented a poster outlining my research and was able to talk to multiple other PhD/Master's students about my work.
Year(s) Of Engagement Activity 2020
 
Description Presented research progress to my CDT 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact I presented work on the spatial transferability of ecological models and how this impacts our ability to predict biological responses to anthropogenically driven environmental change.
Year(s) Of Engagement Activity 2020
 
Description Virtual presentation at the Evidence Synthesis and Meta-Analysis in R Conference 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Other audiences
Results and Impact I presented research from my published work, discussing the use of automated approaches for finding ecological datasets in the literature.
After the talk I took part in a panel discussion about the use of automated approaches in research syntheses and answered questions from the audience.
Year(s) Of Engagement Activity 2021
URL https://www.youtube.com/watch?v=upMLe6_khDk