SignGPT: Building Generative Predictive Transformers for Sign Language
Lead Research Organisation:
University of Surrey
Department Name: Centre for Vision, Speech and Signal Processing (CVSSP)
Abstract
Globally, 70 million deaf people rely on sign language as their primary form of communication. For many deaf people, written languages are their second or third language, and they may not be proficient readers. There is no universal sign language, and no direct relationship between the sign language of a given country and its spoken language. Sign languages have their own grammars and lexicons and use both manual (hands) and non-manual (body and face) articulators, combined with the use of space, to convey meaning. Sign languages have evolved naturally within deaf communities, and the rules that govern them are still an active area of linguistic research. Automatic conversion from a sign language to a spoken language, and vice versa, is a complex translation problem and one that is currently unsolved, although much of the relevant state of the art originates from the partners in this grant.
Speech recognition is now an everyday consumer technology (e.g. Google Assistant, Alexa and Siri). Furthermore, recent developments in Large Language Models (LLMs), such as Generative Pre-trained Transformers (GPT), have led to new levels of AI, epitomised by ChatGPT. Automatic approaches to sign language recognition and production are lagging behind, and this proposal seeks to redress that balance.
Our vision for this Programme Grant is to solve the sign language translation problem. This involves developing the AI and machine learning tools needed to translate between signed and spoken languages: allowing spoken language to be automatically translated into photo-realistic sign language, and video of sign language to be translated into spoken language. To support this aim, the Programme Grant will curate the largest sign language dataset in the world and use it to build a sign language GPT model that can provide a breadth of application to the deaf community equivalent to what LLMs have provided for written and spoken language. In doing so, the Programme Grant will also generate tools for data annotation that will be released for use by the wider community. This outcome will greatly enhance communication between deaf and hearing people, enabling full access for deaf people in today's information society.
To achieve this, we assemble a large multidisciplinary research team of leading authorities in computer vision, sign language linguistics, computational linguistics, machine learning and AI, bringing together the Universities of Surrey, Oxford and UCL alongside leading UK Deaf organisations. We will produce open-source toolkits for linguistic use and web-based demonstrations for accessible dissemination, and will run outreach programmes alongside collaborative workshops. Our showcase demonstration will be a fully functioning real-time sign language interface to ChatGPT, allowing a fluent signer to converse naturally with a machine.
Organisations
- University of Surrey (Lead Research Organisation)
- Microsoft - Munich (Project Partner)
- Security Control Systems (SCS) Ltd (Project Partner)
- GSK (Project Partner)
- Gallaudet University (Project Partner)
- University of Wolverhampton (Project Partner)
- Heriot-Watt University (Project Partner)
- University of Zurich (Project Partner)
- SIGNAPSE LTD (Project Partner)
- British Deaf Association (Project Partner)
- British Sign Language Broadcasting Trust (Project Partner)
- University of Hamburg (Project Partner)
- Royal Association for Deaf people (Project Partner)
- The British Broadcasting Corporation (BBC) (Project Partner)
