Causality in Complex Data

Lead Research Organisation: University of Oxford

Department Name: Engineering Science

Abstract

Deep learning has led to great successes in applications such as object detection, speech recognition and natural language understanding. However, current methods rely largely on correlations in the data and are therefore unreliable for tasks that require causal knowledge, for example in health care, justice or agriculture, where this can lead to problems such as racial bias.

To address these issues, researchers have increasingly turned to the framework of causality, which deals with the question of "why". Within this field, a substantial body of work focuses on determining the causal relationships between given variables (for example, learning which causes are responsible for a symptom in medicine). However, the problem of obtaining these variables has been largely ignored, under the implicit assumption that they are provided by domain experts such as medical doctors or economists. This substantially limits the range of scenarios to which current causal methods can be applied, as manual collection of these variables is expensive and a limited set of variables may not capture all properties of the problem under study. The aim of our work is to help solve these problems by extending the framework of causality so that it can be used with more complex data, such as images or video rather than just single variables, and to use the information therein to learn about the causal relationships represented in the data.

Our proposed approach focuses on using information from interventions, domain shifts and temporal structure, factors which animals use but which are usually disregarded in machine learning. We make use of time-based data (such as video) and exploit the fact that interventions in the real world are relatively sparse, while the distributions of the variables representing the scene before and after an intervention are relatively stationary (for example, a room is dark for a long time, then someone intervenes by flipping the light switch, and finally the room is light for a long time). Using this, we have built a simple detector of interventions and tested it on toy data. Assuming that interventions change only a small part of what happens in a video, we plan to use this detector to discover the entities that are causally responsible for the behavior of other entities, whatever they may be, in a completely unsupervised way. We plan to test our method on synthetic data using a modified computer game and then extend it to real-world scenarios, such as a video of people performing actions. Ultimately, we hope this method will let machines learn to detect causal variables automatically from complex inputs and understand how they are causally related.
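The abstract does not describe how the intervention detector works, so the following is only a minimal illustrative sketch of the general idea, not the project's method. It assumes a one-dimensional toy signal (a hypothetical per-frame mean brightness) and flags time steps where the mean of a trailing window differs sharply from the mean of a leading window, which relies on the stated assumptions that interventions are sparse and the signal is roughly stationary between them. The function name `detect_interventions` and the parameters `window` and `threshold` are made up for illustration.

```python
import numpy as np


def detect_interventions(signal, window=20, threshold=4.0):
    """Return indices where the signal's mean shifts abruptly.

    Hypothetical sketch: compares the `window` samples before each time
    step with the `window` samples from that step onward, and scores the
    difference in means relative to the pooled standard deviation.
    """
    signal = np.asarray(signal, dtype=float)
    n = len(signal)
    scores = np.zeros(n)
    for t in range(window, n - window + 1):
        before = signal[t - window:t]
        after = signal[t:t + window]
        # Pooled standard deviation; the small constant avoids division by zero.
        pooled = np.sqrt(0.5 * (before.var() + after.var())) + 1e-8
        scores[t] = abs(after.mean() - before.mean()) / pooled

    # Interventions are assumed sparse, so each contiguous run of
    # above-threshold scores is reduced to its single highest-scoring index.
    detections = []
    t = 0
    while t < n:
        if scores[t] > threshold:
            end = t
            while end < n and scores[end] > threshold:
                end += 1
            detections.append(t + int(np.argmax(scores[t:end])))
            t = end
        else:
            t += 1
    return detections


# Toy data mirroring the light-switch example: a long "dark" stretch,
# one intervention at sample 200, then a long "light" stretch.
rng = np.random.default_rng(0)
dark = rng.normal(0.1, 0.02, size=200)
light = rng.normal(0.9, 0.02, size=200)
brightness = np.concatenate([dark, light])

found = detect_interventions(brightness)
print(found)
```

On this toy signal the detector should report a single index at or immediately next to sample 200, where the switch is flipped. A real video would need a richer representation than a single brightness value; the window length and threshold here are arbitrary choices for the toy case.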

The applications of our research are wide-ranging, as our proposed method is very general. One example is robotics, where causal knowledge learned from a robot's camera could be used to plan and then decide which actions to take in its environment. For example, a fire-rescue robot which has learned that pulling a door handle opens a door might be able to rescue a person from inside a building. Another example is a self-driving car that, using causal knowledge learned from videos of car crashes, determines that driving into a wall will damage it, and so avoids that scenario without being explicitly programmed to do so. The advantage over regular deep networks in these cases is that the decisions of the autonomous agents are driven by their causal knowledge (instead of just correlations), which could be scrutinized by safety personnel designing safe autonomous driving policies or by an investigator trying to determine why a car crash occurred.

This project falls within the EPSRC Engineering research area, focusing on Artificial Intelligence Technologies.

Student:

Marian Longa

Period of Study:

Oct 2019 - Sep 2022

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2283489

Research Topic:

Unclassified

Organisations

University of Oxford (Lead Research Organisation)

People	ORCID iD
Andrea Vedaldi (Primary Supervisor)
Marian Longa (Student)

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/R513295/1			01/10/2018	30/09/2023
2283489	Studentship	EP/R513295/1	01/10/2019	30/09/2022	Marian Longa
