Photo Identification of Marine Cetaceans Using Convolutional Neural Networks

Lead Research Organisation: Newcastle University
Department Name: Sch of Computer Science

Abstract

Modelling cetacean population dynamics and behaviour is paramount to effective population management and conservation. Robust data is required for the design and implementation of conservation strategies and to assess risk presented by anthropogenic activity such as offshore wind turbines and fishing. Moreover, cetaceans (whales, dolphins and porpoises) make prime candidates for modelling ecosystem change under the ecosystem sentinel concept as they reflect the current state of the ecosystem and respond to change across different spatial and temporal scales. As global climate changes and urbanisation of coastal areas intensifies, it is imperative to develop methodologies for quick and effective assessment of the biological and ecological impact of rising sea temperatures, pollution and habitat degradation. This can be achieved through modelling the population, behaviour and health of large marine species such as dolphins.

Methodologies of cetacean research include photo identification (photo-id). Photo-id involves collecting photographic data and identifying individuals based on unique permanent markings, and it has been used for more than 40 years for modelling cetacean population dynamics and ecology. However, it is costly, time-consuming, and requires good survey conditions and manpower. It can also lead to biased behavioural studies, as vessel presence impacts natural behaviour and data collection is limited to what observers see at the surface. This project addresses these limitations by applying the methodologies, techniques, and computational power of deep learning to the field of marine biology. Once trained, models can be run on field-deployable computers to perform image analysis in real time from multiple data sources. Methodologies incorporating these models will be designed to quickly identify individuals, assess health, analyse behaviour and incorporate remote sensing techniques.

Aims and Objectives

The project will develop novel image analysis algorithms that can be deployed on small, low-cost Linux-based computers and taken into the field to perform analyses in real time. These algorithms will be developed to work on three different image types of white-beaked dolphin (WBD): above water from a waterborne vessel, underwater, and aerial drone. Initial testing has already begun, using a preliminary image database. This database consists mainly of from-boat images of Indo-Pacific bottlenose dolphins collected during a previous expedition to Zanzibar, Tanzania, along with more recent images of WBD. Some models have already provided useful preliminary results and have been trained to locate WBD fins above the surface of the water (see Figure \ref{fig:boxed}). Once models have been developed that provide an acceptable level of accuracy on the test set, the whole-system approach will be demonstrated via fieldwork in the North Sea, during which the system will be deployed in real time. The efficiency and effectiveness of the system will be demonstrated through measurements of how well it identifies and catalogues known pods against the catalogue of individuals already produced.
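As an illustration only, the measurement described above (how well the system identifies known individuals against an existing catalogue) could be sketched as follows. This is a minimal sketch under stated assumptions, not the project's method: it assumes each fin image has already been reduced to a numeric feature vector by some trained model, and the names identify and top1_accuracy, the cosine-similarity matching rule, the 0.8 threshold and the individual IDs are all hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)


def identify(query, catalogue, threshold=0.8):
    """Return (individual_id, score) for the closest catalogue entry.

    The ID is None when no entry reaches the (assumed) threshold,
    i.e. the animal is treated as not yet catalogued.
    """
    best_id, best_score = None, -1.0
    for individual_id, reference in catalogue.items():
        score = cosine_similarity(query, reference)
        if score > best_score:
            best_id, best_score = individual_id, score
    if best_score < threshold:
        return None, best_score
    return best_id, best_score


def top1_accuracy(labelled_queries, catalogue, threshold=0.8):
    """Fraction of (true_id, vector) pairs matched to the correct individual."""
    if not labelled_queries:
        return 0.0
    correct = sum(
        1
        for true_id, vector in labelled_queries
        if identify(vector, catalogue, threshold)[0] == true_id
    )
    return correct / len(labelled_queries)


# Hypothetical catalogue: one reference vector per known individual.
catalogue = {
    "WBD-001": [1.0, 0.0, 0.0],
    "WBD-002": [0.0, 1.0, 0.0],
}
```

In this sketch, a query vector that does not resemble any catalogue entry closely enough is reported as unknown rather than forced onto the nearest individual, which is one plausible way to handle animals not yet in the catalogue.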

Key Hypotheses

This project will test multiple key hypotheses. These can be outlined as:

- The use of deep learning will aid biologists by speeding up the identification of marine cetaceans.
- WBD, and in the future other marine cetaceans, can be identified through the use of deep learning algorithms trained to recognise specific individual patterns, such as scratch and bite marks, as well as individual colour markings.

Planned Impact

The CDT will have impact in a range of areas:


Industrial and Public Sector Impact

The Centre's main impact will be made through its graduates: it will develop highly skilled researchers with the theoretical and practical skills to transform existing organisations, and create successful new companies.

We have already obtained commitment from 30 partner organisations both large and small, regional, national and international, who wish to work closely with the CDT (as evidenced by the letters of support). Impact on them will come through students working on projects specified by partners, students being placed with partners during their PhD, and ultimately through students moving into positions of influence in organisations when they graduate.
The norm for all software developed in the CDT will be to release it as open source so that it can be exploited by industry. In our experience this can attract companies and be a catalyst for productive collaboration - code from our previous projects has been widely used internationally.


Economic Impact

The global cloud computing market is expected to grow from $38 billion in 2010 to $121 billion in 2015 (M&M, 2013). Working productively with partners will maximise the chances of economic impact, which will come through organisations using their newfound skills, expertise and tools to realise their potential to transform themselves.

UK industry faces a huge skills gap in this area. Demand for big data staff has risen exponentially (912%) over the past five years from 400 advertised vacancies in 2007 to almost 4,000 in 2012 (e-skills UK, Jan 2013). Over the next five years analysts forecast a 92% rise in the demand for big data skills with around 132K new jobs being created in the UK (e-skills UK, Jan 2013). The CDT will provide expert practitioners to fill this gap.

The reason Newcastle City Council is setting up the £2M cloud business engagement facility that will be co-located with the CDT is that it believes that it can transform the local economy by up-skilling existing workers. This investment brings funding for CPD, cloud events and other outreach activities that will disseminate the knowledge developed in the CDT.


Societal Impact

We will build on the knowledge and pathways created in the Social Inclusion through the Digital Economy Hub (SiDE: 2009-15), which is tackling big data challenges across a range of areas of societal importance e.g. healthcare and mobility for older people. We will build on our existing, long-term relationships with SiDE partners; maintain our links with organisations that represent disadvantaged groups; and work directly with users through the 3000 person User Pool created by the SiDE project.

The CDT also has a strong set of investigators tackling key healthcare challenges through the use of cloud computing in medicine, biology and neuroscience. These subjects are now under a deluge of data, and increasingly researchers (including those in the pool of potential supervisors for this CDT) are using cloud computing to extract knowledge from it.

An annual public engagement open-day will disseminate the CDT's work to a diverse audience.


Academic Impact

Academic impact will come from the graduating students (some of whom will stay in academia), ideas (through publications), the publication of open source software and our delivery of training courses to other CDTs and researchers.

The placing of CDT students at our overseas partner Universities - Berkeley and PUCRS, Brazil (please see letters of support) - will provide a way for our students' research to have direct international impact.
