Large scale neural network language models (Cantab Research)

Lead Participant: CANTAB RESEARCH LIMITED

Abstract

Modelling natural language is important in many domains, for example text entry on
smartphones, automatic speech recognition and automatic translation. Many of these are
now commonplace services, although their performance is far from perfect. This project
will create new algorithms to dramatically improve the performance of these technologies.
The project will bring together:
* recent advances in statistical modelling
* the ease of building teraflop-scale computers with Graphics Processing Units
* large-scale data collection from the web
* recent small-scale use of machine learning in language modelling
* the proposer's 20 years of experience in this area, and
* specific basic research developed by Cantab that has yet to be experimentally verified on a commercial scale
to dramatically improve the state of the art in statistical language modelling. The primary
results for Cantab Research will be significantly better automatic speech recognition and
licensing revenue from other domains.

Lead Participant: CANTAB RESEARCH LIMITED
Project Cost: £167,600
Grant Offer: £100,000

Participant: THE TECHNOLOGY STRATEGY BOARD
