Fake speech: Generating and detecting convincing voice impersonations by humans and machines
Lead Research Organisation:
University of Cambridge
Department Name: Linguistics
Abstract
This doctoral project targets the criminal use of deepfake technology in the audio domain, namely fake speech (i.e. impersonated speech). While deepfake applications have become more widely known to the public in recent years, the general perception of deepfakes still relates to visual impersonation, i.e. digitally transplanting an individual's face (typically that of a famous person) onto another person's, often with considerable success as deepfaked images and videos become ever more convincing. Although the domain of fake speech is less well known to the general public, it nevertheless presents real problems and consequences for society. Hence, this project will tackle the problem of fake speech through a four-way permutation: i) humans detecting fake speech generated by humans; ii) humans detecting fake speech generated by machines; iii) machines detecting fake speech generated by humans; iv) machines detecting fake speech generated by machines. By improving the rate of fake speech detection in each of the four scenarios, it is anticipated that the results arising from this project can form part of a broader, concerted effort to combat fake speech.
Organisations
People | ORCID iD
---|---
Kirsty McDougall (Primary Supervisor) |
Daniel Lee (Student) |
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name
---|---|---|---|---|---
ES/P000738/1 | | | 01/10/2017 | 30/09/2027 |
2751223 | Studentship | ES/P000738/1 | 01/10/2022 | 30/09/2025 | Daniel Lee