Affective Mechanisms for Modelling Lifelong Human-Robot Relationships

Lead Research Organisation: University of Cambridge
Department Name: Computer Science and Technology

Abstract

As robots become an integral part of human life, it is important that they are equipped with enhanced interaction capabilities. Human-Robot Interaction (HRI) research for Social Robots has thus gained momentum, with researchers focusing on making these interactions as smooth and natural as possible. It is important for robots to become natural extensions of their human environment, capable of holding extended and repeated interactions with users. Emotional Intelligence (EIQ) is central to human-human interactions, adding meaning and context. EIQ is therefore indispensable for naturalistic and engaging human-robot interactions, enabling robots to adapt their responses and provide their users with personalised interaction experiences.

Although most current HRI studies embed emotion recognition capabilities in robots, they rely on frame-based absolute annotations. These are limited to a handful of disjoint emotional categories such as anger, happiness or sadness, with little to no overlap amongst them. Such broad generalisation of emotions seems counter-intuitive when we look at how humans interact with each other and express emotions. Human emotions develop over time and vary with individuals, interaction partners and environments. It is thus beneficial to adopt a continuous view of emotions that allows us to map the valence (the positive or negative nature of an emotion) as well as its intensity, providing smoother transitions. It is also important to model emotions in a developmental and evolving manner, where a series of evaluations over time yields a robust model of the affective context of an interaction. This emotional understanding will enable a robot to form intrinsic affective responses as an evaluation of its state in the interaction. Based on these evaluations, it shall learn to interact with users while performing different tasks under various HRI scenarios.

To address these open questions, this research will focus on modelling long-term relationships between humans and companion robots using deep and hybrid neural architectures. Multi-modal emotion perception techniques will be devised, combining modalities such as vision and speech. The robot shall use this perception to incrementally learn the emotional context of its interactions with users by monitoring their responses. Evolving neural representations that model the short-term as well as long-term impact of such affective interactions shall form the basis for learning optimal behaviour under different environmental conditions. This understanding shall also develop as the robot interacts with different users, generalising its learning in the process. New reinforcement learning mechanisms shall be investigated to achieve lifelong adaptation of robot behaviour in various HRI contexts. Interacting with different user groups, the robot shall learn to assist or coach them in performing complex cognitive tasks, such as playing collaborative or competitive games for cognitive training, while focusing on their mental health and cognitive development.

This research, aligning itself with the primary supervisor's EPSRC grant Adaptive Robotic EQ for Well-being (ARoEQ), aims to develop a holistic and autonomous system for emotional understanding and robot behaviour modelling, attempting to move away from Wizard-of-Oz approaches. Equipping companion robots with such an affective understanding will enable them to engage users in cognitive tasks using affective interaction capabilities. Inspired by the central principles of affective computing, this PhD project shall (i) bridge the gap between feature-dependent computational models and the deeper psychological and cognitive understanding of human factors and (ii) build a holistic model for actualising affective behaviour in robots for assisting humans.
 
Description This research investigates Continual or Lifelong Learning (CL) as a learning paradigm for affective robots to develop long-term Human-Robot Interaction (HRI) solutions. This requires not only learning to be sensitive towards individual user behaviour but also adapting robot responses to complement such personalisation.

Towards this aim, our first work focused on developing a novel CL-based learning framework (CLIFER) for Facial Expression Recognition (FER) that enables person-specific learning and adaptation. To the best of our knowledge, the proposed framework was the first implementation of the continual learning paradigm for facial expression recognition. The framework was able to incrementally learn to recognise different expressions for each subject in the dataset. Our key findings from this work were as follows:

1. CL offers an effective learning paradigm for developing personalised facial expression recognition systems. The proposed framework is able to learn to recognise facial expressions, one expression class at a time, without forgetting previously learnt expressions.
2. Simulating synthetic person-specific data using a generative model greatly enhances the model's performance on both new and previously learnt expression classes (a minimal sketch of this replay scheme follows the list).
3. The order in which the model learns the expression classes has a significant impact on model performance. Learning to recognise neutral (no expression) faces first and then learning different expressions enhances the model's performance.
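
The class-incremental idea with generative replay can be illustrated with a minimal, self-contained Python sketch. The Gaussian "generator" and nearest-mean classifier below are toy stand-ins chosen for brevity, not the actual CLIFER architecture; only the training-loop structure reflects the approach described above.

    import numpy as np

    rng = np.random.default_rng(0)

    class GaussianGenerator:
        """Toy stand-in for the generative model: fits a Gaussian per learnt
        expression class and 'imagines' synthetic samples from it."""
        def __init__(self):
            self.stats = {}                       # class label -> (mean, std)

        def fit(self, label, X):
            self.stats[label] = (X.mean(axis=0), X.std(axis=0) + 1e-6)

        def sample(self, label, n):
            mu, sd = self.stats[label]
            return rng.normal(mu, sd, size=(n, mu.size))

    class NearestMeanFER:
        """Toy incremental classifier holding one prototype per class."""
        def __init__(self):
            self.prototypes = {}

        def partial_fit(self, X, y):
            for label in np.unique(y):
                self.prototypes[label] = X[y == label].mean(axis=0)

        def predict(self, X):
            labels = list(self.prototypes)
            protos = np.stack([self.prototypes[l] for l in labels])
            dists = ((X[:, None, :] - protos[None]) ** 2).sum(-1)
            return np.array(labels)[dists.argmin(axis=1)]

    def learn_class_incrementally(model, generator, data_per_class):
        """Learn one expression class at a time, rehearsing earlier classes
        with imagined samples so they are not forgotten."""
        seen = []
        for label, X in data_per_class.items():   # e.g. 'neutral' first
            generator.fit(label, X)
            Xs = [X] + [generator.sample(l, len(X)) for l in seen]
            ys = [np.full(len(x), l) for x, l in zip(Xs, [label] + seen)]
            model.partial_fit(np.concatenate(Xs), np.concatenate(ys))
            seen.append(label)

Ordering the keys of data_per_class with 'neutral' first mirrors finding 3 above.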

One of the key challenges identified from our experience working on the CLIFER framework was the lack of formal definitions that could be leveraged to implement and evaluate CL solutions for affective robots. Thus, in our second work, we focused on providing a formalisation of CL as a learning paradigm for affective robotics research. We reformulated personalisation towards individual affective expression and context-specific behavioural learning in robots as CL problems. We proposed a theoretical framework that can allow affective robots to continually learn and adapt as they interact with their users. Such lifelong learning is essential for modelling long-term human-robot interactions, where agents learn and adapt towards individual users based on their preferences. Our key findings from this work were as follows:
1. Modelling user-specific attributes such as user preferences, contextual attributions or personality-specific traits can allow models to form semantic groupings of users and learn appropriate robot behaviours.
2. Conducting interactions with the robot under context-neutral settings during introductions allows the robot to form normative baselines. These baselines can act as anchors for user behaviour and enable the robot to easily sense deviations from this norm (see the sketch after this list).
3. Real-world application of CL solutions entails a memory vs. compute trade-off. To balance this, robots can make use of cloud-based solutions or utilise forgetting mechanisms on unused memory locations or parts of the model.
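
The normative-baseline idea from point 2 can be sketched as follows; the affect features, calibration interface and deviation threshold are illustrative assumptions rather than a published specification:

    import numpy as np

    class UserBaseline:
        """Per-user normative baseline formed during a context-neutral
        introduction; later observations are compared against this anchor."""
        def __init__(self, threshold=2.0):
            self.threshold = threshold        # tolerance in std units (assumed)
            self.mean = None
            self.std = None

        def calibrate(self, intro_features):
            # intro_features: (n_frames, n_dims), e.g. valence/arousal estimates
            self.mean = intro_features.mean(axis=0)
            self.std = intro_features.std(axis=0) + 1e-6

        def deviation(self, features):
            # z-scored distance from the user's own norm
            z = (features - self.mean) / self.std
            return np.linalg.norm(z, axis=-1)

        def is_atypical(self, features):
            return self.deviation(features) > self.threshold

During introductions the robot would call calibrate(); in later sessions, is_atypical() flags behaviour that departs from the user's norm and may warrant adapting the robot's response.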

Another challenge for machine learning-based models for affect perception is handling biases. Fair and unbiased analysis and interpretation of human affective behaviour are among the factors that can contribute to the realisation of effective long-term HRI. Successful long-term HRI can be used to provide physical and emotional support to users, engaging them in a variety of application domains that require personalised human-robot interaction, including healthcare, education and entertainment. Yet, as the training of FER models becomes heavily data-dependent, these models may be prone to biases originating from imbalances in the data distribution with respect to attributes like gender, race, age or skin colour, implicitly encoded in the data. To address this, in our next work, we proposed the novel application of CL, benefiting from its capacity for lifelong learning, to develop fairer FER models for affective robots that can balance learning with respect to different attributes of gender and race. The key outcomes from this work were:
1. Applying continual learning principles to learn facial expression recognition one domain group (gender or race) at a time enhances fairness in the models (see the sketch after this list).
2. Continual Learning models tend to balance learning across different domain groups rather than striving for high accuracy on one group at the cost of others. This helps mitigate the effects of biases arising from skewed training data distributions.
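
A minimal sketch of the domain-incremental training and the fairness check it supports is given below. The learner is assumed to be any incremental model exposing a scikit-learn-style partial_fit(X, y) method (the NearestMeanFER toy above would do), and accuracy parity is just one of several possible fairness measures:

    import numpy as np

    def train_domain_incrementally(model, data_by_group):
        """Learn one demographic domain group (e.g. a gender or race
        category) at a time, rather than pooling a skewed dataset."""
        for group, (X, y) in data_by_group.items():
            model.partial_fit(X, y)
        return model

    def group_accuracies(model, data_by_group):
        """Per-group expression recognition accuracy."""
        return {g: float((model.predict(X) == y).mean())
                for g, (X, y) in data_by_group.items()}

    def fairness_score(accs):
        """Accuracy parity: worst-group accuracy over best-group accuracy.
        1.0 is perfectly balanced; values near 0 indicate strong bias."""
        vals = list(accs.values())
        return min(vals) / max(vals)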

Most facial affect perception models rely on evaluating static image frames, encoding a snapshot of heightened facial activity. In real-world interactions, however, facial expressions are more subtle and evolve over time, requiring models to incorporate temporal information and evaluate how expressions are formed. In our next work, we focused on combining spatial (image-based) and spatio-temporal (sequence-based) analyses to encode the temporal evolution of facial expressions. The key outcomes from this work were:
1. During apex frames, where facial expressions are fully formed at peak intensity, attending to relative spatial changes within frames provides the information most relevant to understanding the expression.
2. When the expression is in the onset (beginning to form from neutral) or offset (returning to neutral from peak intensity) phase, it is more important to focus on temporal changes across frames, understanding which parts of the face move and how.
3. Combining both spatial and spatio-temporal features, and learning when to selectively focus on each, improves the ability of facial affect perception models to capture subtle changes in facial expressions (a sketch of such gated fusion follows the list).
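
A minimal PyTorch sketch of such gated fusion is shown below. The encoders, feature sizes and gating mechanism are illustrative assumptions rather than the published architecture; the point is the learned trade-off between spatial and spatio-temporal features:

    import torch
    import torch.nn as nn

    class GatedAffectFusion(nn.Module):
        """Toy sketch: fuse per-frame spatial features with spatio-temporal
        features via a learned gate, letting the model emphasise spatial
        cues at apex frames and temporal cues during onset/offset phases."""
        def __init__(self, feat_dim=128, n_classes=7):
            super().__init__()
            self.spatial = nn.Sequential(       # per-frame appearance encoder
                nn.Conv2d(3, 16, 3, stride=2, padding=1), nn.ReLU(),
                nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                nn.Linear(16, feat_dim))
            self.temporal = nn.GRU(feat_dim, feat_dim, batch_first=True)
            self.gate = nn.Sequential(nn.Linear(2 * feat_dim, 1), nn.Sigmoid())
            self.head = nn.Linear(feat_dim, n_classes)

        def forward(self, clip):                # clip: (B, T, 3, H, W)
            B, T = clip.shape[:2]
            frames = self.spatial(clip.flatten(0, 1)).view(B, T, -1)
            seq, _ = self.temporal(frames)      # spatio-temporal features
            s, t = frames[:, -1], seq[:, -1]    # last-frame summaries
            g = self.gate(torch.cat([s, t], dim=-1))
            return self.head(g * s + (1 - g) * t)  # gated spatial/temporal mix

    # Example: logits = GatedAffectFusion()(torch.randn(2, 8, 3, 64, 64))
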
Exploitation Route The outcomes of this project can enable the development of fully autonomous affective robots that are able to sense and adapt to user behaviour, engaging users in long-term interactions. Such learning capabilities can help develop companion robots for use in the education and healthcare domains. In education, continually learning robots can act as instructors or coaches that learn and adapt to the needs of their pupils, sensing their affective responses to provide assistance when necessary. In healthcare, such robots can help promote well-being by acting as companions that interact with individuals on a daily basis, caring for them and providing daily assistance.
Sectors Digital/Communication/Information Technologies (including Software), Education, Healthcare, Other

 
Description Theoretical Framework for Continual Learning for Affective Robotics: Why, What and How? 
Organisation Middle East Technical University
Country Turkey 
Sector Academic/University 
PI Contribution Developed and published a theoretical framework for the application of Continual Learning for Affective Robotics. We contributed to the conception and development of the framework and were the primary contributors to the resulting manuscript.
Collaborator Contribution We collaborated with Dr Sinan Kalkan from METU, Turkey on the development of a theoretical framework for the application of Continual Learning for Affective Robotics. Dr Kalkan provided expertise in Continual Learning, computer vision and robotics that helped shape the theoretical framework. He also contributed to the writing of the resulting manuscript in a secondary role.
Impact As a result of this collaboration, a conference article was published describing the theoretical framework developed with Dr Kalkan. The details of the framework can be found here: https://doi.org/10.1109/RO-MAN47096.2020.9223564
Start Year 2019
 
Description McMenemy Seminar talk at Trinity Hall, Cambridge 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact McMenemy seminars at Trinity Hall, Cambridge bring together students from different departments and encourage different perspectives on research projects. The talk was attended by 20+ postgraduate students as well as some faculty members, and prompted active discussions on the real-world applicability of the research.
Year(s) Of Engagement Activity 2019
 
Description Short Presentation at the Department 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Postgraduate students
Results and Impact As a part of the one-minute research presentations, I made an `elevator-pitch' style presentations about the goals of my research to the postgraduate students and faculty at my department.
Year(s) Of Engagement Activity 2019