CausalXRL: Causal eXplanations in Reinforcement Learning

Lead Research Organisation: University of Sheffield
Department Name: Computer Science

Abstract

Deep reinforcement learning (RL) systems are approaching or surpassing human-level performance in specific domains, from games to decision support to continuous control, albeit in non-critical environments and usually learning via random exploration. Despite these prodigious achievements, many applications cannot be considered today because we need to understand and explain how these AI systems make their decisions before letting them interact with, and possibly impact, human beings and society.

There are two main obstacles to AI agents explaining their decisions: they have to provide explanations at a level human beings can understand, and they have to deal with causal relations rather than statistical correlations. Hence, we believe the key to explainable AI, in particular for decision support, is to build or learn causal models of the system being intervened upon. Instead of standard machine learning and reinforcement learning networks, we will therefore leverage the new science of causal inference to equip deep RL systems with the ability to learn, plan with, and justifiably explore cause-effect relationships in their environment. RL systems based on this novel CausalXRL architecture will provide cause-effect and counterfactual justifications for their suggested actions, allowing them to fulfil the right to an explanation in human-centric environments.

We will implement the CausalXRL architecture as a bio-plausible (neuromorphic) algorithm to enable its deployment in resource-limited, e.g., mobile, environments. We will demonstrate the broad applicability and impact of CausalXRL on several use cases, from neuro-rehabilitation and intensive care to farming and education.
 
Description CausalXRL aims to interpret how specific machine learning algorithms make decisions, allowing humans to understand the reasons behind a selected outcome. The decision-making algorithm first learns a model of the environment, which enables it to simulate possible decision scenarios in advance and to explain why a specific one is chosen. The project has generated the following outcomes:

- Development of a Methodology for Learning a Transition Model of the Environment: The team created a method to learn how the environment changes from one state to another in response to actions. This predictive model is structured as a causal directed acyclic graph (DAG), which, in simpler terms, maps out the sequence of events or actions leading from one state to another in a clear, logical order. Such a structure is vital for providing causal explanations in reinforcement learning, a type of artificial intelligence where machines learn to make decisions by trying different strategies to achieve a goal. (A minimal illustrative sketch appears after this list.)

- Dimensionality Reduction: In partnership with the University of Vienna, the project made progress on reducing the complexity of data from the system under study, with direct application to neural recordings. High-dimensional dynamics involving vast numbers of data points and interactions were transformed into a simpler, lower-dimensional representation without losing essential information. This process ensures that the underlying cause-and-effect relationships remain intact and understandable, which is crucial for making the machine's decision-making process transparent and justifiable. (See the second sketch after this list.)

- Method to Measure Exploration in Reinforcement Learning: In collaboration with Inria, Lille, we developed a new approach, based on optimal transport in policy space, to quantify how much exploration an artificial intelligence system needs when learning through reinforcement learning compared to more direct learning methods. This work measures how far the system moves away from its current strategy as it improves its decision-making capabilities. (See the third sketch after this list.)

- Modelling Continuous State-Action-Time Systems: We worked on adapting reinforcement learning to systems where changes occur continuously over time in a continuous state-action space, using neural networks capable of solving differential equations (neural ODEs). This work is pivotal for applications requiring real-time decision-making. The project also embarked on developing biologically plausible learning algorithms, aiming for more natural and efficient machine learning methods. (See the fourth sketch after this list.)
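
To make the first outcome concrete, below is a minimal sketch of a transition model organised as a causal DAG: one regressor per next-state variable, each restricted to that variable's parents in an assumed graph. The variable names, the fixed graph, and the linear regressors are illustrative assumptions, not the project's actual method.

```python
# Minimal sketch of a DAG-structured transition model (illustrative only).
import numpy as np
from sklearn.linear_model import LinearRegression

# Assumed causal graph: parents of each next-state variable among (s0, s1, a).
parents = {
    "s0_next": ["s0", "a"],        # s0' is caused by s0 and the action
    "s1_next": ["s0", "s1", "a"],  # s1' is caused by both state variables and the action
}
cols = {"s0": 0, "s1": 1, "a": 2}

def fit_dag_transition_model(data, next_states):
    """Fit one regressor per next-state variable, restricted to its DAG parents."""
    return {var: LinearRegression().fit(data[:, [cols[p] for p in pa]], next_states[var])
            for var, pa in parents.items()}

# Toy transitions (s0, s1, a) -> (s0', s1') consistent with the assumed graph
rng = np.random.default_rng(0)
data = rng.normal(size=(500, 3))
next_states = {
    "s0_next": 0.9 * data[:, 0] + 0.5 * data[:, 2],
    "s1_next": 0.3 * data[:, 0] + 0.8 * data[:, 1] - 0.2 * data[:, 2],
}
models = fit_dag_transition_model(data, next_states)
# Explanations can then cite each effect's parents, e.g.
# "s1 changed because of s0, s1 and the chosen action".
```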
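The second outcome can be illustrated with a sketch of dynamics- and behaviour-preserving dimensionality reduction, loosely in the spirit of BunDLe-Net (listed under research tools below): an encoder is trained so that a latent one-step map predicts the next latent state while behaviour remains decodable. The architecture and losses here are simplified assumptions, not the published algorithm.

```python
# Sketch of behaviour- and dynamics-preserving coarse-graining (assumptions only).
import torch
import torch.nn as nn

class CoarseGrainer(nn.Module):
    """Encoder trained so the latent space keeps dynamics and behaviour."""
    def __init__(self, n_neurons=100, n_latent=3, n_behaviours=4):
        super().__init__()
        self.encode = nn.Sequential(nn.Linear(n_neurons, 32), nn.Tanh(),
                                    nn.Linear(32, n_latent))
        self.dynamics = nn.Linear(n_latent, n_latent)       # latent one-step map
        self.decode_behaviour = nn.Linear(n_latent, n_behaviours)

    def loss(self, x_t, x_next, behaviour_t):
        z_t, z_next = self.encode(x_t), self.encode(x_next)
        dyn = ((self.dynamics(z_t) - z_next) ** 2).mean()   # preserve the dynamics
        beh = nn.functional.cross_entropy(                  # preserve the behaviour
            self.decode_behaviour(z_t), behaviour_t)
        return dyn + beh

# Toy usage on random "neural recordings"
model = CoarseGrainer()
opt = torch.optim.Adam(model.parameters(), lr=1e-3)
x_t = torch.randn(256, 100)          # activity at time t
x_next = torch.randn(256, 100)       # activity at time t+1
b = torch.randint(0, 4, (256,))      # behaviour labels at time t
opt.zero_grad()
model.loss(x_t, x_next, b).backward()
opt.step()
```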
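For the third outcome, the sketch below illustrates the optimal-transport idea in a deliberately simplified form: exploration is scored as the accumulated Wasserstein-1 distance between action distributions of successive policy snapshots at a few probe states. The cited paper's measure is more general; everything here is an illustrative assumption.

```python
# Simplified illustration of exploration as movement in policy space.
import numpy as np
from scipy.stats import wasserstein_distance

def exploration_index(policy_snapshots, probe_states):
    """Sum of Wasserstein-1 distances between action distributions of
    successive policy snapshots, averaged over a few probe states."""
    total = 0.0
    for pi_old, pi_new in zip(policy_snapshots, policy_snapshots[1:]):
        total += float(np.mean([wasserstein_distance(pi_old(s), pi_new(s))
                                for s in probe_states]))
    return total

# Toy usage: each "policy" returns action samples for a given state
rng = np.random.default_rng(1)
snapshots = [lambda s, k=k: rng.normal(loc=0.1 * k * s, scale=1.0, size=200)
             for k in range(5)]
print(exploration_index(snapshots, probe_states=[0.5, 1.0, 2.0]))
```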
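Finally, for continuous state-action-time modelling, here is a minimal sketch in which a neural network parameterises the vector field ds/dt = f(s, a) and trajectories are rolled out with a simple Euler integrator, standing in for the differential-equation solvers such work uses. Dimensions and names are illustrative assumptions.

```python
# Sketch of a continuous-time transition model with a learned vector field.
import torch
import torch.nn as nn

class VectorField(nn.Module):
    """Neural network parameterising the continuous-time dynamics ds/dt = f(s, a)."""
    def __init__(self, state_dim=4, action_dim=2):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(state_dim + action_dim, 64),
                                 nn.Tanh(), nn.Linear(64, state_dim))

    def forward(self, s, a):
        return self.net(torch.cat([s, a], dim=-1))  # ds/dt

def rollout(field, s0, actions, dt=0.05):
    """Euler-integrate the learned dynamics under a sequence of actions."""
    s, traj = s0, [s0]
    for a in actions:
        s = s + dt * field(s, a)   # one Euler step of the ODE
        traj.append(s)
    return torch.stack(traj)

field = VectorField()
s0 = torch.zeros(1, 4)
actions = [torch.randn(1, 2) for _ in range(10)]
traj = rollout(field, s0, actions)   # predicted trajectory, shape (11, 1, 4)
```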
Exploitation Route Making machine learning algorithms explainable is a challenging endeavour and requires extensive research. The findings and advancements from this project set a foundational framework and pave the way for applications in causal, explainable reinforcement learning. The potential avenues for taking the findings forward include:

Future Research and Grants: The strategies, techniques, and algorithms we have developed serve as a robust foundation for future scientific inquiry and innovation in this area, including continued development through further funding.

Resources for the Global Research Community: By making our research findings and software publicly available, we provide tools that other researchers can use in applications, particularly in neuroscience, and build upon to tackle this challenging problem.
Sectors Digital/Communication/Information Technologies (including Software)

 
Description EPSRC Doctoral Training Partnership (DTP) - Early Career Researcher Scholarship Award
Amount £160,000 (GBP)
Organisation Engineering and Physical Sciences Research Council (EPSRC) 
Sector Public
Country United Kingdom
Start 01/2022 
End 07/2025
 
Title BunDLe-Net 
Description The Behavioural and Dynamic Learning Network (BunDLe-Net) is an algorithm for learning meaningful coarse-grained representations from time-series data. It maps high-dimensional data to a low-dimensional space while preserving both dynamical and behavioural information. It has been applied to, but is not limited to, neuronal manifold learning.
Type Of Material Computer model/algorithm 
Year Produced 2023 
Provided To Others? Yes  
Impact We expect this to become a key tool for dimensionality reduction of neural data.
URL https://github.com/akshey-kumar/BunDLe-Net
 
Description Dr Debabrota Basu 
Organisation Inria research centre Lille - Nord Europe
Country France 
Sector Public 
PI Contribution At Sheffield, Gilra, Vasilaki, and Manneschi collaboratively conducted staff training sessions to ensure a high level of competency across the team. Gilra supervised the entire project, delineated the research, developed a robust methodology, and played a pivotal role in composing the manuscript.
Collaborator Contribution Basu participated in research discussions, refined the methodology, and contributed to editing the paper.
Impact Nkhumise RM, Basu D, Prescott TJ, Gilra A. Measuring Exploration in Reinforcement Learning via Optimal Transport in Policy Space. arXiv; 2024. doi:10.48550/arXiv.2402.09113
Start Year 2021
 
Description Professor Moritz Grosse-Wentrup 
Organisation University of Vienna
Country Austria 
Sector Academic/University 
PI Contribution Gilra offered expertise in neural network design and neural data analysis. He contributed to research discussions, refined the methodology, and co-authored a research publication.
Collaborator Contribution Grosse-Wentrup contributed his expertise in causal modelling, facilitated the training of staff, and provided essential facilities, including lab space. Furthermore, he led the development of the methodology and guided the authoring of the manuscript.
Impact Kumar A, Gilra A, Gonzalez-Soto M, Meunier A, Grosse-Wentrup M. BunDLe-Net: Neuronal Manifold Learning Meets Behaviour. bioRxiv; 2023. p. 2023.08.08.551978. doi:10.1101/2023.08.08.551978
Start Year 2022
 
Description Annual talk on our project CausalXRL at the CHIST-ERA projects seminar, a yearly meeting of running projects funded by CHIST-ERA
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact Our grant CausalXRL is funded under the CHIST-ERA call for eXplainable AI (XAI) 2020. CHIST-ERA organises a yearly meeting of all ongoing funded projects. Acting coordinator Gilra represented our CausalXRL project and gave a 10-minute talk at each of these annual meetings. Those in 2021 and 2022 were held online, while the one in 2023 was held in Bratislava, Slovakia. The 2024 meeting will be held in Finland, and Gilra will present there as well.
Year(s) Of Engagement Activity 2021,2022,2023,2024
URL https://www.chistera.eu/projects-seminar-2023-programme
 
Description Poster presentation at the Cognitive Computational Neuroscience international conference in Oxford, UK
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach International
Primary Audience Professional Practitioners
Results and Impact The co-authors of https://www.biorxiv.org/content/10.1101/2023.08.08.551978v3, including Gilra, presented a poster on their work at the Cognitive Computational Neuroscience (CCN) international conference in 2023, held in Oxford, UK. Apart from giving the work visibility among researchers in the field, the poster prompted active discussions with several researchers, both at the poster session and elsewhere at the conference.
Year(s) Of Engagement Activity 2023
URL https://2023.ccneuro.website/view_paper6f34.html?PaperNum=1089