DeepMyna

Lead Participant: HABITAT LEARN LIMITED

Abstract

Habitat Learn Limited (HLL) provides a personalised learning toolkit (HabitatLearn) to improve learning outcomes for students. HabitatLearn is an "ecosystem" of live closed-captioning, note-taking and mind maps that enables students to reach their individual learning objectives. It has 20,000+ UK student users and 300 HEI partners in the UK and North America. DeepMyna builds upon the IUK project DeepSpark, an AI toolkit that optimises automated speech recognition (ASR) models for higher education.

The European Accessibility Act (2019) requires public sector bodies to provide accurate video captioning so that lectures are accessible to students with specific learning disabilities. Despite advances in ASR, human intervention is still required to achieve that accuracy, which makes costs prohibitive.

The use of AI to convert speech into text is well documented. However, current solutions have significant issues:

* Accuracy for technical, complex language is poor, as most ASR models are built for general conversation.
* Privacy and security concerns arise when confidential material is shared on the cloud.
* Gender and race bias renders the technology unusable for certain individuals.
* Costs for human-curated alternatives are too high.

DeepMyna proposes to address these problems by developing a framework that includes provenance data in ASR training and evaluation, so that bias can be detected and tracked throughout the ASR training process. Current ASR evaluation focuses too heavily on overall accuracy in a given language. We believe that everyone should be able to benefit from ASR, and that ASR should be tailored to everyone. When fine-tuning ASR services, we therefore need to be clear about what data has been fed into the model and to evaluate how that data affects the fine-tuned model.

The DeepMyna framework will be designed around HLL's datasets, compiled from 300,000 hours of lecture recordings with manually curated transcripts and summary notes. From these we will extract provenance data for ASR training, such as language, background noise, gender, age and accent. The same provenance data will also be applied to the evaluation matrix, i.e. to measure the bias in each domain defined in the provenance data.
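As an illustration of what evaluating bias per provenance domain could look like, the sketch below computes word error rate (WER) separately for each value of one provenance attribute. It is a minimal example under stated assumptions, not the project's framework: the record layout (`reference`, `hypothesis`, `provenance`), the attribute values and the helper names `word_errors` and `wer_by_group` are all hypothetical, and WER is chosen here only because it is a common ASR accuracy measure.

```python
from collections import defaultdict


def word_errors(reference, hypothesis):
    """Return (word-level edit distance, number of reference words)."""
    ref, hyp = reference.split(), hypothesis.split()
    # Levenshtein distance over words, keeping only the previous row.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i]
        for j, hyp_word in enumerate(hyp, start=1):
            cost = 0 if ref_word == hyp_word else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1], len(ref)


def wer_by_group(samples, attribute):
    """Pooled WER for each value of one provenance attribute."""
    errors = defaultdict(int)
    words = defaultdict(int)
    for sample in samples:
        err, n_words = word_errors(sample["reference"], sample["hypothesis"])
        # Records without the attribute are grouped under "unknown".
        group = sample["provenance"].get(attribute, "unknown")
        errors[group] += err
        words[group] += n_words
    return {g: errors[g] / words[g] for g in words if words[g] > 0}


# Hypothetical records: the field names and values are illustrative only.
samples = [
    {"reference": "the cell membrane is selectively permeable",
     "hypothesis": "the cell membrane is selectively permeable",
     "provenance": {"accent": "accent_a", "noise": "low"}},
    {"reference": "entropy never decreases in an isolated system",
     "hypothesis": "entropy never decreases in isolated system",
     "provenance": {"accent": "accent_b", "noise": "high"}},
    {"reference": "the mitochondria produce energy",
     "hypothesis": "the mitochondria produces energy",
     "provenance": {"accent": "accent_b", "noise": "low"}},
]

for group, wer in sorted(wer_by_group(samples, "accent").items()):
    print(f"{group}: WER = {wer:.3f}")
```

Running the same function over each attribute in turn (for example `"noise"` instead of `"accent"`) gives one row per provenance domain. A large gap between groups is the kind of signal that a single overall accuracy figure would hide.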

HLL has good working relationships with many partners among ASR and transcription companies. We will first select partners to form the consortium, then work together on the use-case descriptions and proposed solutions.

HLL will target financial services and independent financial advisers (IFAs), as well as the medical, legal and commercial markets, where the accuracy and security of information are often critical.

DeepMyna will broaden HLL's customer base outside of its core education market.

Lead Participant | Project Cost | Grant Offer
HABITAT LEARN LIMITED | £48,912 | £48,912

Participant

DATASUMI LTD
