Multimodal Intention Prediction in Assistive Robotics

Lead Research Organisation: Imperial College London
Department Name: Electrical and Electronic Engineering


Humans display an innate ability to accurately identify the intentions of others simply from perceiving their actions. Two biological characteristics that likely contribute to this predictive ability are multisensory integration and internal models. Multisensory integration refers to the brain's simultaneous processing of an array of sensory inputs, such as visual stimuli, to construct a robust multimodal percept. Internal models, on the other hand, are neural functions that anticipate the resulting behaviour of perceived actions based on internal simulations of our own sensorimotor repertoire.

Motivated by these neural mechanisms, a multimodal intention prediction framework is proposed for robot-assisted mobility. The framework consists of multiple artificial internal models operating in parallel to recognise the motor command of a human, given the interacting robot's multimodal sensory observations. Each of these models captures a representation of the observed behaviour using a probabilistic algorithm for inference and learning known as variational autoencoding. This algorithm takes a data-driven approach to fitting an approximate latent distribution over arbitrary input observations, which can subsequently be used at runtime to infer the most probable motor command. Upon internal recognition of this command, the robot then predicts the likely outcome of executing the action according to the kinematic constraints of the mobile base.
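The inference step described above can be sketched in miniature. The following is an illustrative, hypothetical example only: the network weights are random stand-ins for parameters that would in practice be learned by training the variational autoencoder, and all dimensions and names are assumptions rather than details from the project itself.

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy dimensions (hypothetical): a fused multimodal observation vector,
# a low-dimensional latent space, and a discrete set of motor commands.
OBS_DIM, LATENT_DIM, N_COMMANDS = 12, 3, 4

# Stand-in "learned" parameters; a trained VAE would supply these.
W_mu = rng.normal(size=(LATENT_DIM, OBS_DIM))
W_logvar = rng.normal(size=(LATENT_DIM, OBS_DIM))
W_dec = rng.normal(size=(N_COMMANDS, LATENT_DIM))

def encode(obs):
    """Map an observation to the parameters of an approximate latent Gaussian."""
    return W_mu @ obs, W_logvar @ obs

def sample_latent(mu, logvar):
    """Reparameterised sample z = mu + sigma * eps from the latent distribution."""
    eps = rng.normal(size=mu.shape)
    return mu + np.exp(0.5 * logvar) * eps

def decode(z):
    """Map a latent sample to a probability distribution over motor commands."""
    logits = W_dec @ z
    exp = np.exp(logits - logits.max())
    return exp / exp.sum()

def infer_command(obs, n_samples=50):
    """Monte Carlo estimate of the most probable command for an observation."""
    mu, logvar = encode(obs)
    probs = np.mean([decode(sample_latent(mu, logvar))
                     for _ in range(n_samples)], axis=0)
    return int(np.argmax(probs)), probs

# e.g. concatenated gaze, joystick and environment features (random here)
obs = rng.normal(size=OBS_DIM)
cmd, probs = infer_command(obs)
```

At runtime, each parallel internal model would run this kind of inference over its own observation stream, with the most probable command selected across models.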

To exemplify a use case of this proposal, the framework can be situated in a robotic wheelchair setting. Robotic wheelchairs have the potential to offer disabled individuals an augmented means of independence. However, administering such assistance for independent mobility is a complex task, primarily due to the diverse, noisy and unpredictable input signals of patients with severe disability. As a result, a component that can accurately predict the intended path of a user is a vital addition to the development of these wheelchairs. In this context, sensory observations can include the state of the patient and environment, whilst output actions are either joystick commands to issue motor control or designated poses on a map to navigate towards.
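To make the outcome-prediction step concrete in this wheelchair setting, a joystick command can be mapped to clamped velocities and rolled forward through a standard unicycle (differential-drive) model. This is a minimal sketch under assumed velocity limits and time step; the actual kinematic constraints of the wheelchair base are not specified in this summary.

```python
import math

# Assumed kinematic limits for the wheelchair base (hypothetical values).
MAX_LINEAR = 1.0   # m/s
MAX_ANGULAR = 1.5  # rad/s

def joystick_to_velocity(forward, turn):
    """Clamp a normalised joystick deflection in [-1, 1] to the velocity limits."""
    v = max(-1.0, min(1.0, forward)) * MAX_LINEAR
    w = max(-1.0, min(1.0, turn)) * MAX_ANGULAR
    return v, w

def rollout(pose, v, w, horizon=2.0, dt=0.1):
    """Integrate the unicycle model to predict the pose after `horizon` seconds."""
    x, y, theta = pose
    for _ in range(int(horizon / dt)):
        x += v * math.cos(theta) * dt
        y += v * math.sin(theta) * dt
        theta += w * dt
    return x, y, theta

# Predict where a forward joystick push takes the wheelchair from the origin.
v, w = joystick_to_velocity(0.8, 0.0)
pred = rollout((0.0, 0.0, 0.0), v, w)
```

The predicted pose could then be compared against the map or shown to the user as the anticipated path for the recognised command.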

In developing this intention prediction framework, this research seeks to provide two novel contributions: an examination of the role of eye gaze as a sensory input to the framework, and an investigation into how best to explain the prediction to the interacting person. At present, the tools used to explore these questions in the aforementioned wheelchair application are a wearable eye tracker and an augmented reality headset, respectively. By incorporating eye movements into the robot's intention prediction module and providing graphical cues via augmented reality to inform the user of the inner workings of the artificial intelligence, this PhD project hopes to create a seamless collaboration between the two interacting agents.

This work falls under the following EPSRC research areas:

Artificial intelligence technologies
Assistive technology, rehabilitation and musculoskeletal biomechanics
Graphics and visualisation



Studentship Projects

Project Reference: EP/N509486/1 | Start: 01/10/2016 | End: 30/09/2021
Studentship 1859675 (related to EP/N509486/1) | Start: 01/10/2016 | End: 31/08/2020 | Student Name: Mark Zolotas