ACE-LP: Augmenting Communication using Environmental Data to drive Language Prediction.

Lead Research Organisation: University of Dundee
Department Name: Computing

Abstract

Communication is the essence of life. We communicate in many ways, but it is our ability to speak that enables us to chat in everyday situations. An estimated quarter of a million people in the UK alone are unable to speak and are at risk of isolation. They depend on Voice Output Communication Aids (VOCAs) to compensate for their disability. However, current state-of-the-art VOCAs can only produce computerised speech at an insufficient rate of 8 to 10 words per minute (wpm), and for some users who are unable to use a keyboard, rates are even slower. For example, Professor Stephen Hawking recently doubled his spoken communication rate to 2 wpm by incorporating a more efficient word prediction system and common shortcuts into his VOCA software. Despite three decades of VOCA development, face-to-face communication rates remain prohibitively slow. Users seldom go beyond basic needs-based utterances, as rates remain, at best, 10 times slower than natural speech. Compared with the average of 150-190 wpm for typical speech, aided communication rates make conversation almost impossible.

ACE-LP brings together research expertise in Augmentative and Alternative Communication (AAC) (University of Dundee), Intelligent Interactive Systems (University of Cambridge), and Computer Vision and Image Processing (University of Dundee) to develop a predictive AAC system that addresses these prohibitively slow communication rates by introducing multimodal sensor data to inform state-of-the-art language prediction. For the first time, a VOCA system will not only predict words and phrases: we aim to provide access to extended conversation by predicting narrative text elements tailored to the ongoing conversation.

In current systems users sometimes pre-store monologue 'talks', but sharing personal experiences (stories) interactively using VOCAs is rare. Being able to relate experience enables us to engage with others and to participate in society; indeed, the bulk of our interaction with others is conducted through conversational narrative, i.e. the sharing of personal stories. Several research projects have prototyped ways in which automatically gathered data and language processing can support disabled users to communicate more easily and at higher rates. However, none has succeeded in harnessing the potential of such technology in an integrated communication system that automatically extracts meaningful data from different sources, transforms it into conversational text elements, and presents the results in such a way that people with severe physical disabilities can manipulate and select conversational items for output through a speech synthesiser quickly and with minimal physical and cognitive effort.

This project will develop technology that leverages contextual data (e.g. information about location, conversational partners and past conversations) to support language prediction within an onscreen user interface that adapts to the conversational topic, the conversational partner, the conversational setting and the physical ability of the nonspeaking person. Our aim is to improve the communication experience of nonspeaking people by enabling them to tell their stories easily and at more acceptable speeds.
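As a minimal, hypothetical illustration of the kind of context-driven prediction the project envisages (a sketch, not the project's actual algorithm; all words, weights and context labels below are invented), sensed context such as location or a recognised conversational partner could be used to re-rank a baseline word predictor:

# Minimal sketch (hypothetical, not the ACE-LP implementation): re-rank a
# baseline word predictor using sensed context such as location and partner.
from collections import Counter

# Baseline word frequencies, e.g. learned from the user's past output.
baseline = Counter({"hello": 40, "coffee": 15, "football": 12, "meeting": 8})

# Invented context priors: words made more likely by particular context cues.
context_boost = {
    ("location", "cafe"): {"coffee": 3.0},
    ("partner", "friend_from_club"): {"football": 4.0, "hello": 1.5},
}

def predict(context, k=3):
    """Score each candidate by baseline frequency times any context boosts."""
    scores = dict(baseline)
    for cue in context:
        for word, factor in context_boost.get(cue, {}).items():
            scores[word] = scores.get(word, 0) * factor
    return sorted(scores, key=scores.get, reverse=True)[:k]

# Meeting a friend from the football club in a cafe promotes related words.
print(predict([("location", "cafe"), ("partner", "friend_from_club")]))

Even this toy version shows the intended effect: the same keystroke budget yields suggestions that fit the sensed situation rather than global word frequency alone.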

Planned Impact

Communication is fundamental to quality of life. Having a voice enables disabled individuals to direct their lives; it affects their mental health and helps them find employment and remain in work for longer. Slow communication rates mean that many disabled people are lonely, which results in poor quality of life. "The Right to Speak", a 2012 report by the Scottish Government, warns that poor communication affects the delivery of care and puts people's lives at risk. Although estimates suggest that 0.05% of the UK population could benefit from current VOCAs, recent research by Communication Matters (CM), the largest charity for people who use AAC, has recognised that with changing demographics and improved technology, future generations of VOCAs could be used by an order of magnitude more people (0.5% of the UK population).

This project will target nonspeaking literate people, including people with cerebral palsy, locked-in syndrome, motor neuron disease (MND), head and neck cancer, and Parkinson's disease (PD). According to the CM research, these target groups would account for around 25% of AAC users, approximately 315,000 people in the UK. The severity of disability and the progressive nature of some of these conditions affect families, carers, healthcare professionals, friends and all who interact with the AAC user, increasing the number of beneficiaries significantly.
This is a participatory research project; as such, at least 20 end users, together with their families, friends and caregivers, will benefit directly from the project. From experience, we know that participants benefit socially, and that their self-esteem and communication skills improve, when involved in a research project.

We will extend our existing AAC Research Group online community to engage interested parties throughout the project and so narrow the gap between innovation and adoption. In addition, we will run workshops at Communication Matters Symposia throughout the project to present our results to nonspeaking people, their families, speech and language therapists and other professionals interested in AAC.

We have a strong track record of working with industry to transfer research into commercial products. We will carry out a professionally run industrial workshop, in the form of a technology roadmapping session, in year 3 to identify routes to impact. We have identified two possible routes (an open-source non-profit organisation and a technology start-up) which will be the starting point for our discussions with University advisors, our Advisory Group and our industrial partners. Our three industrial partners anticipate that they will benefit from the outcomes of the research, both in AAC technology and in mainstream surveillance data analysis.

Publicity features in newspapers, on radio and on television will be used to raise public awareness of the abilities and needs of people who use AAC, building on our success in obtaining media publicity for our previous projects. We will also collaborate with the National Museum of Scotland to create an installation as part of its new Communications exhibition.
 
Description Augmentative and alternative communication (AAC) can provide access to computerised speech output for adults who have little or no speech due to disability. These disabilities can be congenital, such as cerebral palsy, or acquired, such as after a stroke. Computer-based speech-generating AAC devices are well suited to communicating needs and wants (such as "I am thirsty"). However, they do not support more complex interactions well, such as conversational narrative ("guess what happened to me today") and social dialogue (e.g. pub chats about football). These interactions form an essential part of social relationships; indeed, social isolation is a major quality-of-life issue amongst people with communication impairment. Our goal is to develop AAC tools that support storytelling and social dialogue.

This project builds on our previous work supporting personal narrative with non-written media in our research prototypes, i.e. symbol-based systems.

In this project, we concentrate on written communication for literate users such as Professor Stephen Hawking, whose communication rate, despite all efforts, remained low at 2 words per minute. We aim to support the use of predictive text by using egocentric video data to inform the prediction algorithms, to extend prediction from single words to whole stories, and to facilitate access to this predicted content through an improved user interface. We will evaluate our new system with several literate adults with complex disabilities.
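As a minimal, hypothetical sketch of extending prediction from single words to whole stories (the story texts, tags and context detector below are invented for illustration), pre-stored narratives could be ranked by how well their tags match the context detected from egocentric video and other sensors:

# Hypothetical sketch: surface the pre-stored story whose tags best match
# the context detected by the (assumed) vision and audio pipeline.
stories = [
    {"text": "Guess what happened at the match on Saturday...",
     "tags": {"football", "friend", "weekend"}},
    {"text": "My physio session went really well this week...",
     "tags": {"clinic", "therapist", "health"}},
]

def suggest_story(detected_context):
    """Rank stories by the overlap between their tags and the detected context."""
    return max(stories, key=lambda s: len(s["tags"] & detected_context))

# Sensors (assumed) report a friend present in a pub, chatting about football.
print(suggest_story({"friend", "pub", "football"})["text"])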
Exploitation Route Workshops with software developers have provided information exchange on ways to support narrative using voice output communication devices. Developers can use our research to inform their development of new software and user interfaces for people with complex disabilities.
Sectors Communities and Social Services/Policy,Digital/Communication/Information Technologies (including Software),Education,Healthcare

URL https://ace-lp.ac.uk
 
Description Adults who use augmentative and alternative communication (AAC), their carers, therapists, AAC companies, and the general public benefited from the research as follows. Adults with complex communication needs (CCN) are the main beneficiaries of our work, because it enhances their communication and interaction; enhanced communication and interaction reduce social isolation and raise self-esteem, leading to a higher quality of life. We engaged with this group by establishing a user pool of people who use AAC or who interact closely with people who use AAC (e.g. carers), and we ran several focus groups exploring ways of improving AAC devices to support personal narrative. AAC companies: leading AAC companies such as Tobii Dynavox and Smartbox are partners in the project and provided us with equipment. We explored ways to implement the results of our research in new systems and ran workshops with their developers to facilitate the transfer process. Tobii Dynavox had already incorporated previous research findings from Dundee into its InterAACt system (then as DynaVox).
First Year Of Impact 2017
Sector Digital/Communication/Information Technologies (including Software),Healthcare
Impact Types Societal

 
Title Multimodal Focused Interaction Dataset 
Description Recording of daily life experiences from a first-person perspective has become more prevalent with the increasing availability of wearable cameras and sensors. This dataset was captured during development of a system for the automatic detection of social interactions in such data streams, in particular focused interactions, in which co-present individuals, having mutual focus of attention, interact by establishing face-to-face engagement and direct conversation. Existing public datasets for social interaction captured from a first-person perspective tend to be limited in terms of duration, number of people appearing, and continuity and variability of the recording. We contribute the Focused Interaction Dataset, which includes video acquired using a shoulder-mounted GoPro Hero 4 camera, as well as inertial sensor data, GPS data, and output from a voice activity detector. The dataset contains 377 minutes (including 566,000 video frames) of continuous multimodal recording captured during 19 sessions, with 17 conversational partners in 18 different indoor/outdoor locations. The sessions include periods in which the camera wearer is engaged in focused interactions, in unfocused interactions, and in no interaction. Annotations are provided for all focused and unfocused interactions for the complete duration of the dataset, and anonymised IDs are provided for 13 people involved in the focused interactions. Beyond the development of social interaction analysis, the dataset may be useful for applications such as activity detection, understanding of personal locations of interest, and person association. An illustrative annotation-processing sketch is given after this record.
Type Of Material Database/Collection of data 
Year Produced 2018 
Provided To Others? Yes  
Impact We know of no non-academic impact as yet. 
URL https://discovery.dundee.ac.uk/en/datasets/multimodal-focused-interaction-dataset
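A minimal Python sketch of working with such interaction annotations follows; the file name, column names and CSV layout are assumptions made for illustration, as the dataset's actual formats are specified in its own documentation:

# Hypothetical sketch: sum annotated interaction time per interaction type,
# assuming a CSV with columns start,end,type (times in seconds).
import csv
from collections import defaultdict

def interaction_totals(path):
    """Return total annotated duration in seconds for each interaction type."""
    totals = defaultdict(float)
    with open(path, newline="") as f:
        for row in csv.DictReader(f):
            totals[row["type"]] += float(row["end"]) - float(row["start"])
    return dict(totals)

# e.g. interaction_totals("session01_annotations.csv")
# might return {"focused": 812.0, "unfocused": 240.5, "none": 310.0}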
 
Title QMB Morning Dataset 
Description The proliferation of wearable cameras has accelerated and facilitated a surge in research on the analysis of egocentric video, yet this relatively new field still has few research datasets, and a significant share of the publicly available egocentric datasets were purposely acquired for activity recognition, video summarization, object detection, or behavioural understanding. Here we share a dataset focused on personal localization and mapping. We collected the dataset on the university campus, documenting a user's typical morning; each recording session is initiated at the building entrance as the user enters. To avoid mistakenly capturing bystanders, and the potential privacy challenges this would entail, the dataset was recorded exclusively indoors and before working hours. The five sessions cover a one-month period. During recording, the user followed a loosely worded script to ensure that multiple visits were made to a range of locations. In total, we captured nine distinct stations: Entrance, 3d-lab, Kitchen 1, Kitchen 2, Cafe-area, Lab, Printer 1, Printer 2 and Office. An illustrative visit-counting sketch is given after this record.
Type Of Material Database/Collection of data 
Year Produced 2021 
Provided To Others? Yes  
URL https://discovery.dundee.ac.uk/en/datasets/qmb-morning-dataset
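A minimal Python sketch of counting station visits follows; the per-frame label sequence is an assumed representation for illustration, not necessarily the dataset's native format:

# Hypothetical sketch: collapse consecutive identical per-frame station
# labels into visits, then count how often each station was visited.
from collections import Counter
from itertools import groupby

def count_visits(frame_labels):
    """Count visits per station, treating runs of equal labels as one visit."""
    return Counter(station for station, _ in groupby(frame_labels))

labels = ["Entrance", "Entrance", "Kitchen 1", "Office", "Kitchen 1", "Office"]
print(count_visits(labels))  # Counter({'Kitchen 1': 2, 'Office': 2, 'Entrance': 1})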
 
Title Research data supporting "Fast and Precise Touch-Based Text Entry for Head-Mounted Augmented Reality with Variable Occlusion" 
Description Participant performance data corresponding to experiments described in "Fast and Precise Touch-Based Text Entry for Head-Mounted Augmented Reality with Variable Occlusion". File contains separate sheets for each experiment and the Validation Study. Index page in file summarises experiment conditions and results reported. See publication for detailed descriptions of conditions and metrics. 
Type Of Material Database/Collection of data 
Year Produced 2019 
Provided To Others? Yes  
 
Description Arria NLG Ltd. 
Organisation Arria NLG
Country United States 
Sector Private 
PI Contribution N/A
Collaborator Contribution Membership of advisory panel providing expert advice and guidance.
Impact N/A
Start Year 2016
 
Description Capability Scotland 
Organisation Capability Scotland
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution Delivered staff training to support service enhancement.
Collaborator Contribution Membership of advisory panel providing expert advice and guidance. Access to research participants.
Impact N/A
Start Year 2006
 
Description Communication Matters 
Organisation Communication Matters
Country United Kingdom 
Sector Charity/Non Profit 
Start Year 2008
 
Description NHS Scotland 
Organisation Ninewells Hospital
Country United Kingdom 
Sector Hospitals 
PI Contribution N/A
Collaborator Contribution Membership of advisory panel providing expert advice and guidance. Access to research participants.
Impact N/A
 
Description Public Engagement at the National Museum of Scotland in Edinburgh 
Organisation National Museums Scotland
Country United Kingdom 
Sector Public 
PI Contribution Running PE events at the NMS
Collaborator Contribution Providing venue, training, knowledge transfer and staff time.
Impact Events: Museum Late (late opening of the museum with Edinburgh Fringe Festival previews) - running an interactive Assistive Technology (AT) PE activity. Science Saturday (family event at the museum) - running interactive AT and Augmentative and Alternative Communication (AAC) PE activities.
Start Year 2016
 
Description Scope 
Organisation Scope
Country United Kingdom 
Sector Charity/Non Profit 
PI Contribution N/A
Collaborator Contribution Membership of advisory panel providing expert advice and guidance.
Impact N/A
Start Year 2016
 
Description Smartbox 
Organisation Smartbox
Country United Kingdom 
Sector Private 
PI Contribution Presentation of research outcomes to be incorporated in future technology.
Collaborator Contribution Membership of advisory panel providing expert advice and guidance.
Impact N/A
Start Year 2016
 
Description Tobii Dynavox 
Organisation Tobii
Country United Kingdom 
Sector Private 
PI Contribution Knowledge transfer.
Collaborator Contribution Membership of advisory panel providing expert advice and guidance.
Impact N/A
Start Year 2012
 
Description BBC Click at V&A Dundee 
Form Of Engagement Activity A broadcast e.g. TV/radio/film/podcast (other than news/press)
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Media (as a channel to the public)
Results and Impact The ACE-LP Project was featured on BBC Click, demonstrating all aspects of the project.
Year(s) Of Engagement Activity 2019
URL https://www.bbc.co.uk/iplayer/episode/m000cwsq/click-click-live
 
Description Hackathon with Industrial Partner 
Form Of Engagement Activity A formal working group, expert panel or dialogue
Part Of Official Scheme? No
Geographic Reach National
Primary Audience Industry/Business
Results and Impact Our industrial partner held a company-wide hackathon to support technology transfer of the ACE-LP project outcomes.
Year(s) Of Engagement Activity 2019
 
Description Museum Late (NMS) 
Form Of Engagement Activity Participation in an activity, workshop or similar
Part Of Official Scheme? No
Geographic Reach Regional
Primary Audience Public/other audiences
Results and Impact Museum Late at the National Museum of Scotland is a recurring event at which the museum opens late and presents entertainment in addition to its exhibits. More than 1000 people attended the event, where we showcased assistive technology in an interactive session: Painting with Your Eyes. Visitors were able to use eye gaze to access a computer and draw pictures using eye movement. During the event we informed visitors about the technology and its use by people with physical disabilities, e.g. people with MND. More than 100 people took the opportunity to engage with the research team.
Year(s) Of Engagement Activity 2017