14TSB_DataExpl Crowd-Sourced Prediction of Plant Pest and Disease Occurrence using Mobile Apps

Lead Research Organisation: University of York
Department Name: Computer Science


Growing Interactive, the industrial lead partner of the project, produces the leading online software and apps that gardeners and small-scale farms use to plan the edible crops they grow and achieve increased levels of success. Over a quarter of a million gardeners and farmers have used their software and apps. They hold a wealth of location-based information on which crops, varieties and quantities gardeners are growing in their location, and this year they are extending the recording to include dated observations of pests and diseases on crops (launching May 2014). Event-based journalling has been the most requested new feature for their software and they will be adding social feedback elements to reward reporting.

We propose that this data be statistically analysed in conjunction with meteorological information to develop predictive models for pest and disease emergence on crops. By developing advanced map-based visualisations, the vast quantity of crowd-sourced data can be analysed in depth and used to refine predictive models. Meteorological information and weather forecasts can then provide significantly improved pest prediction for growers for the current growing season specific to their location.

Technical Summary

Growing Interactive (GI), the industrial lead partner of the project, produces leading online software and apps that gardeners and small-scale farms use to plan the edible crops they grow and achieve increased levels of success. Over a quarter of a million gardeners and farmers have used the software and apps. GI holds a wealth of location-based information on which crops, varieties and quantities gardeners are growing in their location, and it is extending the recording to include dated observations of pests and diseases on crops.

This project will statistically analyse the crowd-sourced pest and disease observations recorded in their apps in conjunction with meteorological information to develop much more accurate predictive systems along with innovative visualisations to help predict pest and disease emergence on crops. Meteorological information and weather forecasts can then be used to provide significantly improved pest prediction for growers for the current growing season specific to their location. These will form the basis of new services for GI's customers, the horticultural industry and agriculture, helping growers to reduce losses due to plant pests and diseases.

Planned Impact

1. Economic Impact: Within the consortium, supplying data feeds from the pest prediction models opens up new markets - an estimated 33% increase in revenue to Growing Interactive, growing to 43% over 5 years and new data products. Beyond the consortium, increased efficiency (through decreased losses) in agriculture has wide-ranging benefits, lowering prices and providing even greater benefit to the economy as a whole. On completion of the project, we would like to extend the system to developing countries where mobile phones deliver the internet, working with NGOs to tackle the problem of insect damage in subsistence farming.

2. Social Impact: The 2008 Cabinet Office Study 'Food Matters' highlights the health gains associated with fresh produce:
"Reaching the 5 A DAY target for fruit and vegetable consumption could mean that around 42,000 premature deaths are avoided each year... [the Government] will adopt a specific target of increasing fruit and vegetable consumption in low-income young families." [section ES36]. Growing Interactive's garden planning apps, when combined with the pest and disease prediction service, dramatically increase the success of home growing, providing access to fresh, healthy produce at minimal cost. Studies have also shown that growing food brings substantial benefits across a wide range of health and social issues.

3. Environmental Impact: The 'Food Matters' study states: "The food chain has huge environmental impacts. Reducing the food chain's dependence on energy, water and other resources will reduce its exposure to future increases in resource prices. Reducing the quantity of waste and GHG emissions can improve resource efficiency and anticipate the changes required for the transition to a low-carbon economy."
Improved plant pest and disease prediction impacts almost all of the factors which contribute to the food system's environmental impact: increased efficiency through decreased crop spoilage, reduced exposure to resource price fluctuations, less dependence on expensive pesticides and fungicides and the negative environmental impact of such inputs. More localised pest predictions help make local food systems more productive and competitive, lowering environmental impact as transportation costs are reduced. Pest and disease prediction is particularly useful for organic production techniques, which by nature have considerably lower environmental impact and form an important part of the transition to a low-carbon economy.


Description 1. We developed a novel automated methodology to derive and select the most appropriate environmental variables when predicting ecological phenomena, while simultaneously improving the accuracy of such models. 2. We supported the development of the Big Bug Hunt (bigbughunt.com), an international initiative that asks private gardeners to report garden pest observations. Since its launch in 2016, tens of thousands of reports have been received, and this data will be used to generate new predictive models of pest spread. 3. We supported the development of a visualization tool that shows the spread of garden pests on a geographical map (developed by Growing Interactive, the industrial project partner).
Exploitation Route We have provided a model of aphid pest spread to the Rothamsted Insect Survey. Our model generation methodology will also be incorporated in the apps of Growing Interactive. The app aims to send out warnings of possible pest arrival by email. Initially this will identify when a pest has been spotted nearby, but the long-term goal (not yet met by this first-round research) is prediction based on ML systems 'learning' the meteorological and spatial conditions that increase the probability of pest occurrence.
Sectors Agriculture, Food and Drink, Environment, Other

Description Our methodology for predictive model learning will be used in the gardening apps of our partner, Growing Interactive. These will provide users with alerts on potential garden pests, so that early preventive measures can be taken. Such early measures will reduce the use of pesticides, and thus reduce pollution.
First Year Of Impact 2018
Sector Agriculture, Food and Drink
Impact Types Societal

Title Data Representation in Ecological Phenomena 
Description Selecting the 'best' environmental drivers is an imperative phase in generating predictive biological models; however, environmental variables are often selected with little consideration of the temporal scales at which they operate. Machine learning approaches have the potential to identify the most informative time windows from which to select environmental variables, whilst simultaneously improving species predictions. We used a C5.0 boosted decision tree model along with entropy to identify the most informative temporal extents and resolutions of the meteorological variables responsible for flight patterns in 51 UK aphid species. Decision tree models significantly (α<0.01) improved the accuracy of first flight prediction by upwards of 20% compared to generalised additive model implementations, and entropy-selected variables significantly (α<0.01) improved the accuracy by a further 3-5% compared to expert-derived variables.
Type Of Material Data analysis technique 
Provided To Others? No  
Impact The research is currently under review as a manuscript at Ecological Indicators. We are in discussions with collaborators (Rothamsted Research) about implementing these models in their predictions, which could make those predictions more accurate.
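The entropy-based selection of time windows described in this entry can be illustrated with a minimal sketch. It is not the project's C5.0 implementation: it uses plain information gain on a single threshold split (C5.0 itself uses boosting and a gain-ratio criterion), and the function names, the window encoding as (start day, length) and the toy data are all assumptions made for illustration.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((c / total) * math.log2(c / total) for c in counts.values())


def information_gain(values, labels, threshold):
    """Entropy reduction from splitting the labels on values <= threshold."""
    left = [y for x, y in zip(values, labels) if x <= threshold]
    right = [y for x, y in zip(values, labels) if x > threshold]
    if not left or not right:
        return 0.0
    n = len(labels)
    weighted = (len(left) / n) * entropy(left) + (len(right) / n) * entropy(right)
    return entropy(labels) - weighted


def window_mean(series, start, length):
    """Mean of a daily series over the window [start, start + length)."""
    return sum(series[start:start + length]) / length


def best_window(daily_series, labels, windows):
    """Return (gain, window) for the candidate window whose mean value
    gives the highest information gain about the labels."""
    best = (0.0, None)
    for start, length in windows:
        feats = [window_mean(s, start, length) for s in daily_series]
        uniq = sorted(set(feats))
        # Candidate thresholds: midpoints between adjacent distinct values.
        for lo, hi in zip(uniq, uniq[1:]):
            gain = information_gain(feats, labels, (lo + hi) / 2)
            if gain > best[0]:
                best = (gain, (start, length))
    return best


# Hypothetical toy data: one daily temperature series per site-year,
# labelled by whether first flight was early or late.
example_series = [
    [5, 5, 5, 10, 10, 10],
    [1, 1, 1, 9, 9, 9],
    [5, 5, 5, 2, 2, 2],
    [1, 1, 1, 3, 3, 3],
]
example_labels = ["early", "early", "late", "late"]
gain, window = best_window(example_series, example_labels, [(0, 3), (3, 3)])
print(gain, window)
```

In this toy data the mean over days 0-2 carries no information about the label, while the mean over days 3-5 separates the two classes, so the second window is selected. A real application would search over many extents and resolutions and pass the selected variables to the tree model.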
Title Pseudo-Absence Generation 
Description A key challenge with crowd-sourced data is that reporters only document the presence of a pest. However, the prediction of 'presence' or 'occurrence' when using machine learning techniques requires response variables that represent 'absence' in order to categorise observations. Reliable reports of 'absence' are difficult to obtain. Was the pest really absent, did the user just not observe it, or was it simply not there at the time of sampling? Pseudo-absences provide a comparative data set that enables the conditions under which a species occurs to be contrasted with those where it is absent (or at least with the full range of environmental conditions within the species' geographic range). The majority of methods developed to generate pseudo-absences focus on the spatial aspect of where these are generated and how they are weighted. The time-stamp associated with crowd-sourced observations, coupled with the overall aim of this project to develop 'real-time' pest predictions, means that pseudo-absences need to be generated in a way that is both spatially and temporally explicit. Our preliminary research indicates that incorporating time into these predictions can have large implications for the predictive outputs, and this is an avenue that we are currently exploring further.
Type Of Material Data handling & control 
Provided To Others? No  
Impact N/A 
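One simple way to generate pseudo-absences that are explicit in both space and time is sketched below. This is an illustrative assumption, not the method developed in the project: candidates are drawn at random within a bounding box and date range, and any candidate that falls close to a reported presence in both distance and time is rejected. The function names, the 10 km and 7 day buffers, and the example coordinates are hypothetical.

```python
import math
import random
from datetime import date, timedelta


def haversine_km(lat1, lon1, lat2, lon2):
    """Great-circle distance in kilometres between two lat/lon points."""
    radius_km = 6371.0
    p1, p2 = math.radians(lat1), math.radians(lat2)
    dphi = p2 - p1
    dlmb = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(p1) * math.cos(p2) * math.sin(dlmb / 2) ** 2
    return 2 * radius_km * math.asin(math.sqrt(a))


def conflicts(candidate, presences, min_km, min_days):
    """True if the candidate is near a presence in BOTH space and time."""
    lat, lon, day = candidate
    return any(
        haversine_km(lat, lon, p_lat, p_lon) < min_km
        and abs((day - p_day).days) < min_days
        for p_lat, p_lon, p_day in presences
    )


def generate_pseudo_absences(presences, bbox, start, end, n,
                             min_km=10.0, min_days=7, seed=0, max_tries=10000):
    """Draw up to n (lat, lon, date) pseudo-absences by rejection sampling.

    bbox is (lat_min, lat_max, lon_min, lon_max). Sampling is uniform in
    degrees, which is not equal-area; acceptable only for a small sketch.
    """
    rng = random.Random(seed)
    lat_min, lat_max, lon_min, lon_max = bbox
    span_days = (end - start).days
    absences = []
    for _ in range(max_tries):
        if len(absences) == n:
            break
        candidate = (
            rng.uniform(lat_min, lat_max),
            rng.uniform(lon_min, lon_max),
            start + timedelta(days=rng.randint(0, span_days)),
        )
        if not conflicts(candidate, presences, min_km, min_days):
            absences.append(candidate)
    return absences


# Hypothetical example: one presence report and a rough UK bounding box.
presences = [(53.96, -1.08, date(2016, 6, 1))]
bbox = (50.0, 58.0, -5.0, 1.0)
absences = generate_pseudo_absences(
    presences, bbox, date(2016, 5, 1), date(2016, 9, 30), n=20
)
```

Because a candidate is rejected only when it is close in both distance and date, a location near a reported presence can still serve as a pseudo-absence at a different time of the season, which a purely spatial scheme cannot express.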
Description Rothamsted Insect Survey, Rothamsted Research 
Organisation Rothamsted Research
Department North Wyke Farm Platform
Country United Kingdom 
Sector Private 
PI Contribution We used our expertise in machine learning (ML) and ecological modelling to explore the prediction of aphid first flight dates. In particular, we took the data from James Bell and colleagues' 2015 research article in the Journal of Animal Ecology and used machine learning techniques to derive new meteorological variables that indicated flight patterns. Our research resulted in an increase in accuracy of approximately 20% when predicting short-term aphid first flight dates compared to the methods implemented by Rothamsted. We are now exploring the benefit of these models for making long-term predictions of aphid flight patterns, a commercial product that Rothamsted produces.
Collaborator Contribution Rothamsted Research, through James Bell, provided us with 30 years (1980-2010) of aphid flight metrics from 17 sites around the United Kingdom. We have held various meetings with James, in which he has used his expertise in aphid phenology to inform us about the importance of meteorological variables to different species and to identify 'high-risk' species that should give our research national and international impact.
Impact We submitted a research manuscript to Ecological Indicators on February 8th 2017. This article, co-authored by Paul Holloway, Daniel Kudenko, and James Bell, is still under review. This collaboration is multi-disciplinary, spanning computer science, entomology, geography and ecology.
Start Year 2016
Description Crowd-Sourced Data Website and Campaign 
Form Of Engagement Activity Engagement focused website, blog or social media channel
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Public/other audiences
Results and Impact The Big Bug Hunt website (http://bigbughunt.com) was created for crowd-sourced data acquisition from gardeners around the world.
In addition, a 'widget' version of the reporting tool was developed and placed on major gardening websites such as GrowVeg.com, Suttons.co.uk, Almanac.com and MotherEarthNews.com.
Weekly content (during the gardening season) was (and continues to be) produced including newsletters, bug identification guides, videos and other engaging content such as top reported pests and research news.
A major press release and media awareness campaign was launched in 2017 and commitments to print have been obtained from major gardening media, including the RHS magazine, Gardeners World magazine and a national newspaper in the UK. Further afield, the Big Bug Hunt has been covered on international blogs (e.g. bugeric.blogspot.com) and a major press release campaign is currently being organised in the US.
Over 11,000 bug reports have been generated through this engagement activity and this rate is growing each year.
Year(s) Of Engagement Activity 2015,2016,2017
URL http://bigbughunt.com