Machine Learning for Bird Song Learning

Lead Research Organisation: Queen Mary University of London

Department Name: Sch of Biological and Chemical Sciences

Abstract

Songbirds, including familiar species like chaffinches and great tits, share an unusual ability with us: vocal learning. Like us, birds need to hear and imitate others in order to develop their vocal communication signals. Most vertebrate species, mammals included, cannot do this, and neither can any primate species apart from us. In recent years, research into the development, neurobiology, and genetics of song learning has revealed ever deeper links between human speech and bird song - so much so that bird song currently represents the best animal model we have for understanding the biology of speech.

In order to study bird song, researchers need to accurately measure how different songs are from each other. These measures are needed to assess whether one bird really did imitate another, and how precisely they did so. Developing computer algorithms to make such measurements is difficult, however, for many of the same reasons that speech recognition is a difficult task for computers. In this grant, we will use a new approach to solve this problem - inspired by developments in speech recognition. First we will train birds to peck on buttons to get a food reward from a bird feeder, and then train them further to discriminate between different "notes" within bird songs. Then we will train "machine learning" computer algorithms to replicate the birds' decisions. We will thus develop a computer algorithm that we can use to compare bird songs in a way that is biologically validated.

We will then use our algorithm to investigate how birds learn their songs. To do this, we will make use of data-sets where researchers have simply recorded the different songs sung by birds within the population. These data contain a signature of how the birds actually learned their songs in much the same way that our genomes contain signatures of our evolutionary history. We will exploit this by using a statistical technique in combination with simulation models to infer how birds learn their songs: how frequently they generate new song types due to errors or innovations; who they prefer to learn from; and which songs they prefer to learn. We will do this for 15 different species and populations, allowing us, for the first time, to compare how different groups learn their songs.

Technical Summary

Bird song learning research has been built on our ability to judge the similarity between song syllables, but current methods have not been validated against birds' own perception. In order to carry out the next generation of studies of song learning, we need to develop more accurate methods, rooted in biology. And to do that, we first need comprehensive data-sets of how birds themselves perceive differences in song syllables.

Objective 1: Generate data-sets for how birds perceive differences between song syllables, using operant conditioning with an AXB task (the bird indicates whether a sound X is more similar to sound A or to sound B), for three unrelated species: zebra finch, great tit and jackdaw. We will generate around 150,000 trials.

Objective 2: Develop and train machine learning algorithms to measure song syllable similarity. Recent developments in machine learning provide powerful methods for fitting algorithms to complex time series data, like bird song syllables. We will develop and train algorithms using the results from Objective 1. We will compare the performance of our algorithm against current methods, and will host a data tournament for the machine learning field to further search for optimal solutions.

Objective 3: Apply the machine learning algorithms developed in Objective 2 to a fundamental problem in bird song learning: we lack quantitative estimates for how precisely birds learn songs. Without this information, it is impossible to take advantage of the diversity of bird song learning styles in different species and gain a comparative understanding of how song learning behaviour evolves. For this objective, we will (a) collate patterns of song sharing in populations of birds of 15 different taxa; (b) compare syllable structure of all songs within each of the populations using our algorithm; (c) use Approximate Bayesian Computation to fit the results to cultural evolutionary simulations, and thus estimate underlying parameters of learning - in particular the precision of syllable imitation.
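
As a rough illustration of step (c), the sketch below runs rejection-based Approximate Bayesian Computation on a toy cultural evolution model. Everything in it is an assumption made for illustration and is not taken from the project: the neutral copying model, the single innovation rate `mu` (a crude stand-in for imprecise imitation), the summary statistic (number of distinct song types), the uniform prior, and all function names.

```python
import random


def simulate_song_types(mu, n_birds=30, n_generations=50, rng=None):
    """Toy neutral model: each generation, every bird either invents a new
    song type (probability mu) or copies the type of a random tutor from
    the previous generation. Returns the number of distinct types left."""
    if rng is None:
        rng = random.Random()
    songs = [0] * n_birds          # everyone starts with the same type
    next_type = 1                  # label for the next invented type
    for _ in range(n_generations):
        new_songs = []
        for _ in range(n_birds):
            if rng.random() < mu:
                new_songs.append(next_type)
                next_type += 1
            else:
                new_songs.append(rng.choice(songs))
        songs = new_songs
    return len(set(songs))


def abc_rejection(observed_types, n_draws=500, tolerance=1,
                  prior=(0.0, 0.2), seed=0):
    """Draw mu from a uniform prior, simulate, and keep the draws whose
    simulated summary statistic lies within `tolerance` of the observed one.
    The kept draws approximate the posterior distribution of mu."""
    rng = random.Random(seed)
    accepted = []
    for _ in range(n_draws):
        mu = rng.uniform(*prior)
        simulated = simulate_song_types(mu, rng=rng)
        if abs(simulated - observed_types) <= tolerance:
            accepted.append(mu)
    return accepted


# Hypothetical observation: 5 distinct song types recorded in a population.
accepted = abc_rejection(observed_types=5)
print(f"accepted {len(accepted)} of 500 draws")
if accepted:
    print(f"posterior mean of mu: {sum(accepted) / len(accepted):.3f}")
```

A real analysis would replace the toy simulator with a model of tutor choice and song preference, and would compare several summary statistics rather than one.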

Planned Impact

We will generate a state-of-the-art method for comparing the similarity of bird songs, and a data-set for other researchers to use when developing their own methods. Our method will be incorporated into a song-analysis program (Luscinia) that will be readily useable by members of the research field. Research that will benefit from these methods has the following impacts:
(a) Biomonitoring. Bird song is often the best record that we have of avian biodiversity - especially in tropical forests where biodiversity is highest and visibility of birds very limited. Processing hours of song recordings manually is a difficult and skilled task, and recently, interest has grown in computational methods that can automate the task. Our project will add to this by developing the first method validated by avian perception itself. Both R-Co-I Stowell (developer of Warblr), and PI Lachlan (developer of Luscinia) have a proven track record in implementing computational bioacoustic techniques for a broader audience.
(b) Biodiversity. Song often provides one of the critical phenotypic cues needed to identify new species. In some cases, song is the only clear and unambiguous character. To use song features to distinguish taxa, an accurate way to quantitatively compare songs is required; we will create and make this available to the field via the Luscinia software. The less sophisticated measures already implemented in Luscinia have already been used for this purpose, helping to identify the Gran Canarian Blue Chaffinch as a separate species from the Tenerife Blue Chaffinch, and in so doing, discovering the rarest, and one of the most endangered bird species in the E.U. Other labs are currently carrying out similar studies in Colombia and Tanzania amongst other places.
(c) Bird song neuroscience. Bird song is an established model system for speech, at a neurobiological and genomic level. Genes involved with bird song learning have been implicated in human disease. Research into this field requires accurate assessments of song structure and song similarity, which we will deliver. Through PI Lachlan's work on Luscinia, and co-PI Clayton's senior position in the bird song neurobiology field, we again have a clear plan of how we will make our methods available to a broader field and advertise them.

Funded Value:

£535,796

Funded Period:

Jun 18 - Mar 19

Funder:

BBSRC

Project Status:

Closed

Project Category:

Research Grant

Project Reference:

BB/R008736/1

Principal Investigator:

Robert Francis Lachlan

Research Subject:

Animal science (60%)

Ecol, biodivers. & systematics (20%)

Psychology (20%)

Research Topic:

3Rs (20%)

Animal behaviour (20%)

Behavioural Ecology (20%)

Biological Psychology (20%)

Psychology (20%)

Organisations

People	ORCID iD
Robert Francis Lachlan (Principal Investigator)
David Clayton (Co-Investigator)
Daniel Stowell (Researcher Co-Investigator)

Publications

Bear H (2022) An evaluation of data augmentation methods for sound scene geotagging

Bear H (2021) An evaluation of data augmentation methods for sound scene geotagging

Bear H (2021) An Evaluation of Data Augmentation Methods for Sound Scene Geotagging

Morfi V (2021) Deep perceptual embeddings for unlabelled animal sound events. in The Journal of the Acoustical Society of America

Zandberg L (2022) Bird song comparison using deep learning trained from avian perceptual judgments

Zandberg L (2021) Global cultural evolutionary model of humpback whale song. in Philosophical transactions of the Royal Society of London. Series B, Biological sciences

Zandberg L (2024) Bird song comparison using deep learning trained from avian perceptual judgments. in PLoS computational biology

Related Projects

Project Reference	Relationship	Related To	Start	End	Award Value
BB/R008736/1			03/06/2018	31/03/2019	£535,796
BB/R008736/2	Transfer	BB/R008736/1	31/03/2019	02/06/2021	£426,688

Key Findings
Impact Summary
Further Funding
Research Databases and Models
Research Tools and Methods
Engagement Activities


Description	We have collected a large database on how zebra finches perceive sounds. Specifically, we have collected 20,000 trials of data about whether a bird perceives one sound to be more similar to Sound A or Sound B. To do this we used a novel "robotic bird feeder" that allowed us to work with animals in an aviary setting, with little disturbance to their normal life. We have augmented this by collecting 5,000 similar trials with captive great tits. We have also developed a machine learning algorithm that uses these data to mimic bird song perception: given two novel sounds, the algorithm predicts how different a bird would judge them to be. We have trained this algorithm on our dataset of bird decisions, and the trained algorithm does indeed mimic bird perception. It fits birds' decisions about song similarity more closely than the current state of the art in the field. Our results suggest that while bird song research has been using algorithms that are somewhat related to how birds themselves perceive sound, we can do better in this regard. As part of another objective in the grant, we have developed simulation models to explore how animal vocalisations culturally evolve. We have applied these to humpback whale song. Our simulations are able, for the first time, to explain why "revolutionary" patterns of song change occur in southern hemisphere populations, while they don't in the northern hemisphere. We have also collected song variation data from >20 species of birds and have fitted cultural evolutionary models to them, using our new knowledge about song perception. We have revealed that different species of birds do indeed learn with very different degrees of precision.
Exploitation Route	Our work will be used in software programs that compare bird songs. In particular, Luscinia (https://rflachlan.github.io/Luscinia/) is already being shaped by our findings. This software is used by a large number of researchers around the world investigating bioacoustics from ecological through to neurobiological perspectives. It will place the scientific findings from these studies on a firmer footing, and should also improve the quality of data produced. Some of these research fields are highly applied. For example, bioacoustics is widely used in biomonitoring studies. Our work on humpback whale song helps deepen understanding of communication and change in an important species recovering from near extinction. Notably, our results show how changes in population size since the whaling-induced population crash are related to patterns of singing.
Sectors	Environment Healthcare
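
The claim that an algorithm "fits birds' decisions" can be made concrete as an agreement score: for each trial, check whether the algorithm's distances pick the same side as the bird did. The sketch below is a minimal illustration under stated assumptions. The trial layout `(A, X, B, bird_chose_a)`, the two-number feature vectors and the Euclidean distance are all hypothetical and are not the project's data format or its algorithm.

```python
import math


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def axb_agreement(trials, distance=euclidean):
    """Fraction of trials in which the distance measure picks the same
    side (A or B) that the bird picked for the probe sound X."""
    hits = 0
    for a, x, b, bird_chose_a in trials:
        model_chose_a = distance(x, a) < distance(x, b)
        hits += (model_chose_a == bird_chose_a)
    return hits / len(trials)


# Hypothetical trials: (sound A, probe X, sound B, did the bird choose A?)
trials = [
    ((0.0, 0.0), (0.1, 0.0), (1.0, 1.0), True),
    ((0.0, 0.0), (0.9, 1.0), (1.0, 1.0), False),
    ((0.0, 0.0), (0.2, 0.1), (1.0, 1.0), False),  # bird disagrees with the distance
    ((2.0, 0.0), (1.8, 0.1), (0.0, 2.0), True),
]
print(axb_agreement(trials))  # 0.75
```

Any two similarity measures can be compared by passing each one as `distance` and comparing the resulting scores on held-out trials.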


Description	The PI has been invited to a workshop organised by National Geographic (Feb 26th-27th 2019) to discuss how animal culture can be communicated to the general public, and in particular, how animal culture can be used to shape the public's understanding of the value of animal populations. This follows on from an article published in National Geographic by the PI in July 2018 on the basis of research methods developed by the PI that are central to this grant. These methods allow us to track how animal traditions change over time. The R-Co-I has presented work from this grant in the Soapbox Science series. Work on the grant has led to the successful development of an operant device that improves animal welfare by allowing lengthy behaviour experiments to be carried out with animals living in normal group environments as well as in the wild. We have already produced a benefit in our own work - collecting data with minimal welfare impact that previously would have involved significant time in isolation. After communicating about our work in several impact activities, we have also shared expertise and have already supplied versions of our device to other labs around the world. We have further developed this approach by modifying our devices to work outdoors with free-living wild animals. This will potentially allow perceptual studies to be carried out without requiring animals to be kept in captivity at all. We have again received interest from other groups about our technology, and hope to supply devices and plans for building devices to other labs. This will lead to a considerable impact on animal welfare within the research field. We have developed an algorithm that allows us to compare bird song. This is initially and primarily of use to an academic community. However, we are in the process of integrating this knowledge into the PI's software program, Luscinia, and thanks to parallel work, this program is being developed for non-academic users in biomonitoring and for amateur birdwatchers.
First Year Of Impact	2020
Sector	Environment,Other
Impact Types	Cultural Societal


Description	Exploring the evolution of vocal learning through comparative cultural evolution
Amount	£120,000 (GBP)
Organisation	Evolution Education Trust
Sector	Charity/Non Profit
Country	United Kingdom
Start	08/2022
End	09/2026


Description	Fashion and fads in bird song
Amount	£350,000 (GBP)
Organisation	The Leverhulme Trust
Sector	Charity/Non Profit
Country	United Kingdom
Start	11/2021
End	11/2024


Title	Operant bird feeder
Description	We have further refined an operant bird feeder and have successfully used it in our experiments. The new device provides several benefits from a 3Rs perspective. Typically, behavioural experiments on learning require animals to be kept in captivity, in isolation. Our new method obviates those needs. It uses PIT tag technology to identify individuals, and low-power/low-cost Raspberry Pi computers to control the experiment. Sound stimuli are presented, RFID antennae detect individuals and behavioural choices, and a motorised bird feeder controls rewards. Our device functions well without any negative stimuli. We have used our devices in our experiments with animals in aviary contexts and with wild birds outdoors. Our design choices have kept the device relatively low-cost (<£300 in components per device), allowing it to be widely used.
Type Of Material	Improvements to research infrastructure
Year Produced	2019
Provided To Others?	Yes
Impact	We have been able to conduct our experiment without placing animals (zebra finches, great tits) in isolation. This represents a significant welfare refinement since the time taken for the experiment is considerable (>6 months). We have reached out and have already supplied our devices to other research teams in the field.


Title	Machine Learning for Bird Song Learning (ML4BL) dataset
Description	General description: This dataset contains Zebra Finch decisions about perceptual similarity on song units. All the data and files are used for reproducing the results of the paper 'Bird song comparison using deep learning trained from avian perceptual judgments' by the same authors.
Git repo on Zenodo: https://doi.org/10.5281/zenodo.5545932
Git repo access: https://github.com/veronicamorfi/ml4bl/tree/v1.0.0
Directory organisation:
ML4BL_ZF
|_files
|_Final_probes_20200816.csv - all trials and decisions of the birds (aviary 1 cycle 1 data are removed from experiments)
|_luscinia_triplets_filtered.csv - triplets to use for training
|_mean_std_luscinia_pretraining.pckl - mean and std of luscinia triplets used for training
|__cons_ - % side consistency on triplets (train/test) - train set contains both train and val splits
|__gt_ - cycle accuracy for triplets of the specific bird (train/test) - train set contains both train and val splits
|__trials_ - number of decisions made for a triplet (train/test) - train set contains both train and val splits
|__triplets_ - triplet information (aviary_cycle-acc_birdID, POS, NEG, ANC) (train/test) - train set contains both train and val splits
|__low_ - low-margin (ambiguous) triplets (train/val/test)
|__high_ - high-margin (unambiguous) triplets (train/val/test)
|__cycle_bird_keys_* - unique aviary_cycle-acc_birdID keys (train/test) - train set contains both train and val splits
|_TunedLusciniaV1e.csv - pairwise distance of two recordings computed by Luscinia
|_training_setup_1_ordered_acc_single_cons_50_70_trials.pckl - dictionary containing everything needed for training the model (keys: 'train_keys', 'train_triplets', 'val_keys', 'vali_triplets', 'test_triplets', 'test_keys', 'train_mean', 'train_std')
|_melspecs - .pckl - melspectrograms of recordings
|_wavs - wav - recordings
|_README.txt
Recordings: 887 syllables extracted from zebra finch song recordings, with a sampling rate of 48kHz and high-pass filtered (100Hz), with a 20ms intro/outro fade.
Decisions: Triplets were created from the recordings and the birds made side-based decisions about their similarity (see 'Bird song comparison using deep learning trained from avian perceptual judgments' for further information).
Training dictionary information:
Dictionary keys: 'train_keys', 'train_triplets', 'val_keys', 'vali_triplets', 'test_triplets', 'test_keys', 'train_mean', 'train_std'
train_triplets/vali_triplets/test_triplets: Aviary_Cycle_birdID, POS, NEG, ANC, Decisions, Cycle_ACC(%), Consistency(%)
train_keys/val_keys/test_keys: Aviary_Cycle_birdID
train_mean/train_std: shape: (1, mel_bins)
Open Access: This dataset is available under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.
Contact info: Please send any questions about the recordings to: Lies Zandberg: Elisabeth.Zandberg@rhul.ac.uk Please send any feedback or questions about the code and the rest of the data to: Veronica Morfi: g.v.morfi@qmul.ac.uk
Type Of Material	Database/Collection of data
Year Produced	2021
Provided To Others?	Yes
URL	https://zenodo.org/record/5545871
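
Triplets of the form (ANC, POS, NEG) are commonly used with a triplet margin loss, which pushes the anchor closer to the positive than to the negative by at least a fixed margin. The sketch below shows that standard loss on toy vectors. It is an illustration only: this record does not state which loss or network the paper used, and the function names, the Euclidean distance, the margin value and the two-number "embeddings" are assumptions.

```python
import math


def euclidean(u, v):
    """Plain Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def triplet_margin_loss(anchor, positive, negative, margin=1.0):
    """Standard triplet margin loss: zero once the anchor is closer to the
    positive than to the negative by at least `margin`, positive otherwise."""
    gap = euclidean(anchor, positive) - euclidean(anchor, negative)
    return max(0.0, gap + margin)


# Toy embeddings (hypothetical): the anchor sits near one sound, far from the other.
anchor, near, far = (0.0, 0.0), (0.0, 1.0), (3.0, 4.0)

# Bird judged `near` as the more similar sound: the triplet is already satisfied.
print(triplet_margin_loss(anchor, positive=near, negative=far))  # 0.0

# Bird judged `far` as the more similar sound: the embedding is penalised.
print(triplet_margin_loss(anchor, positive=far, negative=near))  # 5.0
```

In training, the vectors would be the outputs of a learned embedding of each recording, and the loss would be averaged over many triplets.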


Description	3R's conference
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Professional Practitioners
Results and Impact	Lies Zandberg presented a talk on the operant device that we have developed during the project. The device allows a wide range of behavioural experiments to be carried out within a non-licensed group-living scenario. Previously, work in this area has typically required animals to be kept in isolation, so this development considerably reduces the welfare impact of such experiments. The talk and conference allowed us to reach out to other practitioners in the field and identify how we can allow other researchers to take advantage of our technology. Conversations were held with 6-7 other attendees about potentially using the technology.
Year(s) Of Engagement Activity	2019


Description	International Bioacoustics Congress talk
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Professional Practitioners
Results and Impact	This was a talk at a scientific conference - but we focused on communicating our new technology that improves animal welfare during behavioural experiments (thus we are not reporting here the scientific impact of our research, but our improvement of practice within the field). Our operant training device allows auditory perception experiments to be carried out in a group setting for the first time. This obviates the need for animals to be held in stressful isolation, and the experiments may even be carried out with wild free-living animals. We had multiple people (>10 groups) express interest in the technology, and are in continued contact with two groups about providing equipment to them.
Year(s) Of Engagement Activity	2019


Description	National Geographic Workshop in Communicating Animal Culture
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	International
Primary Audience	Media (as a channel to the public)
Results and Impact	I was invited to a National Geographic Workshop at the Max Planck Institute for Ornithology on the topic of Communicating Animal Culture: how we can better use animal culture as a tool to increase public knowledge and interest in biodiversity and conservation of populations.
Year(s) Of Engagement Activity	2019


Description	QMUL 3 R's workshop
Form Of Engagement Activity	A formal working group, expert panel or dialogue
Part Of Official Scheme?	No
Geographic Reach	Local
Primary Audience	Professional Practitioners
Results and Impact	A poster was presented on our operant bird feeder device. This device allows us to carry out auditory perception studies on birds in an aviary context, obviating the need for placing birds in isolation. It therefore has direct animal welfare benefits, refining methods used in the field. The poster was presented to a university-wide workshop on welfare, and won the prize for the best poster. Other researchers expressed interest in using our approach for their behavioural experiments.
Year(s) Of Engagement Activity	2019


Description	Soapbox Science talk
Form Of Engagement Activity	A talk or presentation
Part Of Official Scheme?	No
Geographic Reach	National
Primary Audience	Public/other audiences
Results and Impact	A short talk was given online in the Soapbox Science series by Lies Zandberg, about the work carried out in the grant.
Year(s) Of Engagement Activity	2020