UKRI Centre for Doctoral Training in Speech and Language Technologies and their Applications

Lead Research Organisation: University of Sheffield
Department Name: Computer Science

Abstract

A long-term goal of Artificial Intelligence (AI) has been to create machines that can understand spoken and written human language. This capability would enable, for example, spoken language interaction between people and computers, translation between all human languages, and tools to analyse and answer questions about vast archives of text and speech. Spectacular advances in computer hardware and software over the last two decades mean this vision is no longer science fiction but is turning into reality. Speech and Language Technologies (SLTs) are now established as core scientific/engineering disciplines within AI and have grown into a world-wide multi-billion dollar industry, with SLT global revenues predicted to rise from $33bn in 2015 to $127bn by 2024. The UK has long played a leading role in SLT and the government has recently identified AI, including SLT, as being of national importance. Many international corporations such as Google, Apple, Amazon and Microsoft now have research labs in the UK, in part to leverage local SLT expertise, and a new and extensive eco-system of SLT SMEs has sprung up. There is huge demand from these organisations for scientists with advanced training in SLT; most of them hire only at PhD level, as is evident in the support for this CDT by more than 30 partners. The result is fierce international competition to attract talent, and supply is falling far short of demand. It is critically important, therefore, to improve the UK's capacity to address this industrial need for high quality, high value postdoctoral SLT talent, to enhance the UK's position as a leader in the field and, in turn, to attract investment in AI-related technologies and support UK economic growth.

To address the shortfall in PhD-trained scientists we propose a CDT in "Speech and Language Technologies and Their Applications". Our vision is to create a CDT that will be a world-leading centre for training SLT scientists and engineers, giving students the best possible advanced training in the theory and application of computational speech and language processing, in a setting that fosters interdisciplinary approaches, innovation, engagement with real-world users and awareness of the social and ethical consequences of our work. A cohort-based approach is necessary in SLT because: (1) the software infrastructure, tools and methods for SLT are highly complex and creating them is nearly always a collaborative endeavour -- a cohort offers an ideal setting to gain experience of such collaborative working; (2) PhD topics tend to be narrow and focused on specifics and do not include the broad overview needed in students' later careers -- through cohort training we can expose students to a range of different SLT topics; (3) peer learning within and across cohorts is a highly effective way to hand over tools and to teach methodology; (4) a multi-year cohort programme allows significant and sustained progression in larger (i.e. multi-student) SLT projects, resulting in better research outcomes and more impact in partnering companies; (5) cohort teaching is very attractive to students; and (6) an extended cohort-based training programme with strong group work and peer tutoring elements allows students with non-standard backgrounds to be admitted, helping to promote diversity in SLT.

To realise our vision we propose to build on Sheffield's unique strengths in SLT, which include: (1) a large team of SLT academics with an outstanding 30-year research track record in publication, research grant capture and PhD supervision, covering all the core areas of SLT; (2) a large group of industrial partners who actively want to participate in the CDT; (3) a track record of impact arising from our research, through creating new enterprises or enhancing the activities of existing organisations; and (4) an excellent research environment in terms of computing and data resources, study and work facilities, and commitment to and respect for diversity and equality.

Planned Impact

There is huge demand for scientists with advanced training in SLT from both large corporations and SMEs, most of whom hire only at PhD level and are in fierce international competition to attract talent: supply is falling far short of demand. The Centre for Doctoral Training (CDT) in Speech and Language Technologies and their Applications aims to have impact first and foremost by addressing the industrial need for high quality, high value postdoctoral SLT talent by training at least 60 PhD students in a wide range of SLT subjects. These students, through their PhD research outputs and subsequent careers in academia or industry, will enhance the UK's position as a leader in this field, which will, in turn, attract investment in AI-related technologies and strongly support UK economic growth in this area. The CDT fits the "AI and Data-Driven Economy" grand challenge of the Government's Industrial Strategy Challenge Fund and thus supports the associated beneficiaries. The CDT training programme is designed to address needs expressed by a consortium of leading SLT companies as well as SMEs, and health and governmental organisations.

This CDT aims to have impact through four core activities:

(1) Training engineers and researchers: Through cohort-based, multidisciplinary, application-oriented and industry-aware training we will educate the next generation of highly skilled and competitive engineers, who will in addition have entrepreneurial skills. These SLT experts will be used to working in diverse environments, will understand the societal impact of their work and will be able to engage with the public and the media. This will benefit existing organisations as well as the UK economy through fostering a strong startup culture. Aside from the growing need in industry, there is also a growing need for SLT postdoctoral researchers in university-led research that can bring further advances to the sector.

(2) SLT community building: By bringing together leading SLT industry, application experts and students we aim to form a long-lasting SLT community in the UK, for the benefit of all parties involved. In particular, we aim to foster closer links between the speech and language subject areas.

(3) Research: Work conducted in the CDT will be of the highest standard while being relevant to the real world. This will benefit academic research in core subject areas (computer science, mathematics, electrical engineering, psychology, linguistics) through scientific results, software, data sets, and publications. It will also benefit core EPSRC target areas by working on: (a) applications that improve healthcare (e.g. assistive technology, clinical applications of speech technology, analysing medical forums for newly reported drug side effects, problems with healthcare provision, misinformation); (b) SLT applications that support a safe and trusted society (e.g. analysing social media for hate speech, political abuse, terrorist and criminal activities, "fake news" and disinformation), thereby supporting policy makers, government organisations, media, and political scientists in studying, for example, barriers to inclusion and the influence of alternative media on democratic processes and society; and (c) SLT applications that boost productivity via robotics, IoT, and big data analytics.

(4) Public engagement: CDT students will engage with the public in a range of activities, through presentations of their work in non-academic venues (schools, business conferences) and events open to the general public. The centre will also engage with industry and public bodies through a student-run SLT consultancy service (the CDT Hub) and specific public engagement projects.

Studentship Projects

Project Reference  Relationship  Related To    Start       End         Student Name
EP/S023062/1                                   01/04/2019  30/09/2027
2268963            Studentship   EP/S023062/1  30/09/2019  29/09/2023  Megan Charlotte Thomas
2269080            Studentship   EP/S023062/1  30/09/2019  29/09/2023  Peter George Vickers
2268789            Studentship   EP/S023062/1  30/09/2019  29/09/2023  Thomas Alexander Green
2268211            Studentship   EP/S023062/1  30/09/2019  29/09/2023  Hussein S Yusufali
2268920            Studentship   EP/S023062/1  30/09/2019  29/09/2023  Sebastian Tate Vincent
2269013            Studentship   EP/S023062/1  30/09/2019  29/09/2023  Claudia Grace Haworth
2268977            Studentship   EP/S023062/1  30/09/2019  29/09/2023  Joseph William Ravenscroft
2279566            Studentship   EP/S023062/1  30/09/2019  29/09/2023  Danae Sanchez Villegas
2429261            Studentship   EP/S023062/1  30/09/2020  29/09/2024  Jonathan Clayton
2431571            Studentship   EP/S023062/1  30/09/2020  29/09/2024  Samuel Hollands
2430204            Studentship   EP/S023062/1  30/09/2020  29/09/2024  Edward Gow-Smith
2431591            Studentship   EP/S023062/1  30/09/2020  29/09/2024  Rhiannon Mogridge
2431584            Studentship   EP/S023062/1  30/09/2020  29/09/2024  Guanyu Huang
2430187            Studentship   EP/S023062/1  30/09/2020  29/09/2024  Tomas Goldsack
2429251            Studentship   EP/S023062/1  30/09/2020  29/09/2024  Ahmed Alajrami
2431612            Studentship   EP/S023062/1  30/09/2020  29/09/2024  Kyle Reed
2429310            Studentship   EP/S023062/1  30/09/2020  29/09/2024  George Close
2431698            Studentship   EP/S023062/1  30/09/2020  29/09/2024  Melissa Thong
2431622            Studentship   EP/S023062/1  30/09/2020  29/09/2024  Joshua Smith