CausalXRL: Causal eXplanations in Reinforcement Learning
Lead Research Organisation:
University of Sheffield
Department Name: Computer Science
Abstract
Deep reinforcement learning (RL) systems are approaching or surpassing human-level performance in specific domains, from games to decision support to continuous control, albeit in non-critical environments and usually learning via random exploration. Despite these prodigious achievements, many applications cannot be considered today because we need to understand and explain how these AI systems make their decisions before letting them interact with, and possibly affect, human beings and society.
There are two main obstacles to AI agents explaining their decisions: they must provide explanations at a level human beings can understand, and they must deal with causal relations rather than statistical correlations. Hence, we believe the key to explainable AI, particularly for decision support, is to build or learn causal models of the system being intervened upon. Thus, instead of standard machine learning and reinforcement learning networks, we will leverage the new science of causal inference to equip deep RL systems with the ability to learn, plan with, and justifiably explore cause-effect relationships in their environment. RL systems based on this novel CausalXRL architecture will provide cause-effect and counterfactual justifications for their suggested actions, allowing them to fulfil the right to an explanation in human-centric environments.
We will implement the CausalXRL architecture as a bio-plausible (neuromorphic) algorithm to enable its deployment in resource-limited (e.g., mobile) environments. We will demonstrate the broad applicability and impact of CausalXRL on several use cases, ranging from neuro-rehabilitation to intensive care, farming, and education.
Publications
Kumar A
(2023)
BunDLe-Net: Neuronal Manifold Learning Meets Behaviour
Manneschi L
(2022)
Signal neutrality, scalar property, and collapsing boundaries as consequences of a learned multi-timescale strategy.
in PLoS computational biology
Nkhumise RM
(2024)
Measuring Exploration in Reinforcement Learning via Optimal Transport in Policy Space
Description | CausalXRL aims to interpret how specific machine learning algorithms make decisions, allowing humans to understand the reasons behind a selected outcome. The decision-making algorithm first learns a model of the environment, which enables it to simulate possible decision scenarios in advance and explain why a specific one is chosen. The project has generated the following outcomes:
- Learning a transition model of the environment: The team created a method to understand and predict environmental changes. This predictive model is structured as a causal directed acyclic graph, which allows us to map out and explain the sequence of events or actions leading from one state to another in a clear, logical order. Such a structure is vital for providing causal explanations in reinforcement learning, a type of artificial intelligence in which machines learn to make decisions by trying different strategies to achieve a goal.
- Dimensionality reduction: In partnership with the University of Vienna, the project advanced the reduction of the complexity of data from the system under study, with direct application to neural recordings. High-dimensional dynamics involving vast numbers of data points and interactions were transformed into a simpler, lower-dimensional representation without losing essential information. This process ensures that the underlying cause-and-effect relationships remain intact and understandable, which is crucial for making the machine's decision-making process transparent and justifiable.
- Measuring exploration in reinforcement learning: In collaboration with Inria, Lille, we developed a new approach to quantify how much exploration an artificial intelligence system needs when learning through reinforcement learning compared to more direct learning methods. This work measures how often the system explores new strategies to improve its decision-making capabilities.
- Modelling continuous state-action-time systems: We worked on adapting reinforcement learning to systems where changes occur continuously over time in a continuous state-action space, using advanced neural networks capable of solving differential equations. This work is pivotal for applications requiring real-time decision-making. The project also embarked on developing learning algorithms that mimic biological processes, aiming for more natural and efficient machine learning methods. |
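The exploration measure above can be illustrated with a minimal sketch. This is our own toy illustration of the optimal-transport idea, not the exact method of the cited paper: each policy snapshot taken during learning is represented as a probability distribution over a discrete action set, and the total exploration effort is the summed Wasserstein-1 distance between consecutive snapshots (the path length of the learning trajectory in policy space).

```python
import numpy as np

def w1_discrete(p, q):
    """Wasserstein-1 distance between two distributions on the ordered
    support {0, 1, ..., n-1} with unit spacing: the L1 distance between
    their cumulative distribution functions."""
    p, q = np.asarray(p, float), np.asarray(q, float)
    return float(np.abs(np.cumsum(p - q))[:-1].sum())

def exploration_effort(policy_snapshots):
    """Sum of W1 distances between consecutive policy snapshots,
    i.e. the length of the path traced by the policy during learning."""
    return sum(w1_discrete(a, b)
               for a, b in zip(policy_snapshots, policy_snapshots[1:]))

# Toy example: a policy over 4 actions drifting from action 0 to action 3.
snapshots = [
    [1.0, 0.0, 0.0, 0.0],
    [0.5, 0.5, 0.0, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],
]
print(exploration_effort(snapshots))  # → 3.0
```

A policy that jumps between distant actions accumulates a larger path length than one that converges directly, which is the intuition behind using transport distance rather than, say, counting policy updates.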
Exploitation Route | Making machine learning algorithms explainable is a challenging endeavour and requires extensive research. The findings and advancements from this project set a foundational framework and pave the way for applications in causal, explainable reinforcement learning. Potential avenues for taking the findings forward include:
- Future research and grants: The strategies, techniques, and algorithms we have developed serve as a robust foundation for future scientific inquiry and innovation in this area, including continued development through further funding.
- Resources for the global research community: By making our research findings and software publicly available, we provide tools that other researchers can use, particularly in neuroscience, and build upon to tackle this challenging problem. |
Sectors | Digital/Communication/Information Technologies (including Software) |
Description | EPSRC Doctoral Training Partnership (DTP) - Early Career Researcher Scholarship Award |
Amount | £160,000 (GBP) |
Organisation | Engineering and Physical Sciences Research Council (EPSRC) |
Sector | Public |
Country | United Kingdom |
Start | 01/2022 |
End | 07/2025 |
Title | BunDLe-Net |
Description | The Behavioural and Dynamic Learning Network (BunDLe-Net) is an algorithm for learning meaningful coarse-grained representations from time-series data. It maps high-dimensional data to a low-dimensional space while preserving both dynamical and behavioural information. It has been applied to, but is not limited to, neuronal manifold learning. |
Type Of Material | Computer model/algorithm |
Year Produced | 2023 |
Provided To Others? | Yes |
Impact | We expect this to develop as a key tool for dimensionality reduction of neural data. |
URL | https://github.com/akshey-kumar/BunDLe-Net |
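As a toy illustration of the kind of dimensionality reduction involved, the sketch below compresses a synthetic high-dimensional recording to three dimensions. It uses plain PCA via SVD as a stand-in; it is not the actual BunDLe-Net algorithm, which additionally preserves dynamical and behavioural information rather than just variance.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic "neural" recording: 3 latent channels linearly mixed into
# 50 observed dimensions, plus a small amount of measurement noise.
T, D, d = 500, 50, 3
latent = rng.standard_normal((T, d))
mixing = rng.standard_normal((d, D))
X = latent @ mixing + 0.1 * rng.standard_normal((T, D))

# PCA via SVD: project the centred data onto the top-d principal directions.
Xc = X - X.mean(axis=0)
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)
Z = Xc @ Vt[:d].T  # low-dimensional representation, shape (T, 3)

# Fraction of total variance retained by the 3-D embedding.
retained = (S[:d] ** 2).sum() / (S ** 2).sum()
print(Z.shape, round(float(retained), 3))
```

Because the data truly live on a 3-dimensional subspace, nearly all variance survives the compression; BunDLe-Net targets the harder setting where the relevant low-dimensional structure is defined by behaviour and dynamics, not variance alone.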
Description | Dr Debabrota Basu |
Organisation | Inria research centre Lille - Nord Europe |
Country | France |
Sector | Public |
PI Contribution | The Sheffield team (Gilra, Vasilaki, and Manneschi) collaboratively conducted staff training sessions to ensure a high level of competency across the team. Gilra not only supervised the entire project but also delineated the research, developed a robust methodology, and played a pivotal role in composing the manuscript. |
Collaborator Contribution | Basu participated in research discussions, refined the methodology, and contributed to editing the paper. |
Impact | Nkhumise RM, Basu D, Prescott TJ, Gilra A. Measuring Exploration in Reinforcement Learning via Optimal Transport in Policy Space. arXiv; 2024. doi:10.48550/arXiv.2402.09113 |
Start Year | 2021 |
Description | Professor Moritz Grosse-Wentrup |
Organisation | University of Vienna |
Country | Austria |
Sector | Academic/University |
PI Contribution | Gilra offered expertise in neural network design and neural data analysis. He contributed to research discussions, refined methodology and co-authored a research publication. |
Collaborator Contribution | Grosse-Wentrup contributed his expertise in causal modelling, facilitated the training of staff, and provided essential facilities, including lab space. Furthermore, he led the development of the methodology and guided the authoring of the manuscript. |
Impact | Kumar A, Gilra A, Gonzalez-Soto M, Meunier A, Grosse-Wentrup M. BunDLe-Net: Neuronal Manifold Learning Meets Behaviour. bioRxiv; 2023. p. 2023.08.08.551978. doi:10.1101/2023.08.08.551978 |
Start Year | 2022 |
Description | Annual talk presentation on our project CausalXRL to Chist-era projects seminar meetings for running projects funded by Chist-era |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | Our grant CausalXRL is funded under the Chist-era call for eXplainable AI (XAI) 2020. Chist-era organises a yearly meeting of all ongoing funded projects. Acting coordinator Gilra represented the CausalXRL project and gave a 10-minute talk at each of these annual meetings. The 2021 and 2022 meetings were held online, while the 2023 meeting took place in Bratislava, Slovakia. The 2024 meeting will be held in Finland, where Gilra will present as well. |
Year(s) Of Engagement Activity | 2021,2022,2023,2024 |
URL | https://www.chistera.eu/projects-seminar-2023-programme |
Description | Poster presentation at Cognitive Computational Neuroscience international conference at Oxford UK |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | The co-authors of https://www.biorxiv.org/content/10.1101/2023.08.08.551978v3, including Gilra, presented a poster on their work at the Cognitive Computational Neuroscience (CCN) annual international conference in 2023, held in Oxford, UK. Besides gaining visibility among researchers in the field, the co-authors had active discussions about the work with several researchers, both at the poster session and elsewhere at the conference. |
Year(s) Of Engagement Activity | 2023 |
URL | https://2023.ccneuro.website/view_paper6f34.html?PaperNum=1089 |