Discovery of input/output Causal relations in Software Systems

Lead Research Organisation: University College London
Department Name: Computer Science

Abstract

Although digital technologies have profoundly transformed daily life, security and privacy concerns have emerged alongside them. The complexity of their tasks has made automated systems ever more difficult to understand, causing a progressive loss of confidence, especially in critical scenarios involving sensitive data. Current interpretability methods tackle the opacity problem only from partial perspectives: they are tailored to specific systems, highlight statistical correlations rather than causes, or produce overly technical explanations.
Our research sits within the area of Causal Discovery and thus aims to answer the critical question "why did a system make that decision?" by lifting the analysis to higher levels of abstraction. We propose a novel General Causal Explanation Method for testing the behavioural logic of software systems at large scale, serving a dual purpose: to provide human-understandable explanations of why an automated procedure reaches its outcome, in terms of the input categories that directly affect it; and to assess whether a system's decision logic violates predefined software specifications.
We leverage Information Theory and a lattice structure of partitions defined on the input space of the systems under test (SUT), which are treated as black boxes: we only require access to their input/output (I/O) interfaces. We develop an algorithm that investigates I/O causal relationships by performing Conditional Independence Testing via Conditional Mutual Information (CMI). The idea is to look for the smallest subset of input variables (input part) that, when altered, causes a change in the output (or in a given output part). An elimination strategy is used to build explanations, excluding input variables with null CMI.
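As a concrete illustration of this elimination loop, the Python sketch below estimates CMI with a simple plug-in estimator over discrete I/O samples and discards inputs whose estimated CMI is (near) zero. The function names, the eps threshold and the sample format (a list of (input tuple, output) pairs gathered from the black-box SUT) are illustrative assumptions, not the project's actual implementation.

# Hedged sketch of the CMI-based elimination strategy described above.
from collections import Counter
from math import log2

def cmi(samples, x_idx, rest_idx):
    """Plug-in estimate of I(X_i ; output | X_rest) from (inputs, output) pairs."""
    def proj(inp, idxs):
        return tuple(inp[i] for i in idxs)

    n = len(samples)
    joint = Counter((inp[x_idx], out, proj(inp, rest_idx)) for inp, out in samples)
    xz = Counter((inp[x_idx], proj(inp, rest_idx)) for inp, _ in samples)
    yz = Counter((out, proj(inp, rest_idx)) for inp, out in samples)
    z = Counter(proj(inp, rest_idx) for inp, _ in samples)

    total = 0.0
    for (x, y, zz), c in joint.items():
        p_xyz = c / n
        total += p_xyz * log2((p_xyz * (z[zz] / n)) / ((xz[(x, zz)] / n) * (yz[(y, zz)] / n)))
    return total

def explain(samples, n_inputs, eps=1e-3):
    """Elimination strategy: drop inputs whose CMI with the output is (near) zero."""
    influential = list(range(n_inputs))
    for i in range(n_inputs):
        rest = [j for j in influential if j != i]
        if cmi(samples, i, rest) <= eps:
            influential.remove(i)
    return influential  # candidate set of inputs that causally affect the output

In practice the samples would come from executing the SUT on a test suite through its I/O interface, e.g. samples = [(x, run_sut(x)) for x in test_suite], where run_sut is a hypothetical wrapper around that interface.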
The three core pillars of our research are summarized below:
1) An Information-Theory-grounded General Causal Explanation Framework: evaluation of the methodology in different application areas.
We show the versatility of our method in testing software and properties of different natures. We start by focusing on Machine Learning-based predictive systems involving sensitive input categories, programs implementing security policies, and image recognition systems. After dealing with fairness, information leakage and misclassification in black-box scenarios, we will introduce a fourth, white-box case study. Given the growing number of studies focused on testing Social Network platforms, we find it interesting to apply our explanation methodology to model simulated social interactions by leveraging causal reasoning.
2) Enhance Software Testing Performance by adding Statistical Guarantees.
We aim to improve the quality of our test sets and information-theoretic measurements by providing our testing approach with statistical guarantees. We investigate existing statistical methods for estimating Mutual Information, looking for the one most suitable in the context of our study, i.e. one that offers a good statistical confidence level. This would allow us to provide more rigorous findings, to argue that the detected influential and non-influential parts are correct with high confidence, and to make our findings less dependent on estimator approximations and test suite size; a possible shape of such a test is sketched after this list.
3) Generalise Causal Analysis with Directed Information (DI).
We discuss the link between Granger Causality and DI theory, showing that the algebraic structure of DI, simplified to our scenarios, aligns with CMI (see the identity after this list). Extending our generalised approach with DI thus becomes a compelling direction for our research, especially with respect to more complex scenarios where the direction of the I/O relationship is ambiguous. An interesting application context is given by interactive programs (web pages, editors), where information can flow in different directions because multiple users interact with the program simultaneously.
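A minimal sketch of the kind of statistical guarantee envisaged in pillar 2, assuming the same sample format and plug-in cmi estimator as the sketch above: a conditional permutation test that attaches a p-value to the claim that a given input part is influential. This is one standard option among the estimation methods under investigation, not the method the project has committed to.

import random

def cmi_permutation_pvalue(samples, x_idx, rest_idx, n_perm=1000, seed=0):
    """p-value for the null hypothesis that input x_idx is independent of the
    output given the inputs in rest_idx, via a conditional permutation test:
    x_idx is shuffled only within groups sharing the same conditioning values."""
    rng = random.Random(seed)
    observed = cmi(samples, x_idx, rest_idx)

    # Group sample positions by the value of the conditioning inputs.
    strata = {}
    for pos, (inp, _) in enumerate(samples):
        strata.setdefault(tuple(inp[j] for j in rest_idx), []).append(pos)

    exceed = 0
    for _ in range(n_perm):
        shuffled = [list(inp) for inp, _ in samples]
        for positions in strata.values():
            values = [samples[p][0][x_idx] for p in positions]
            rng.shuffle(values)
            for p, v in zip(positions, values):
                shuffled[p][x_idx] = v
        perm = [(tuple(row), out) for row, (_, out) in zip(shuffled, samples)]
        if cmi(perm, x_idx, rest_idx) >= observed:
            exceed += 1
    return (exceed + 1) / (n_perm + 1)  # small value -> influence unlikely to be noise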
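For reference, the identity behind the claim in pillar 3 is Massey's definition of Directed Information, a textbook result rather than a contribution of the project. It decomposes into a sum of conditional mutual information terms and, in a single-round non-interactive setting, collapses to the (C)MI quantities used above:

I(X^n \to Y^n) = \sum_{i=1}^{n} I(X^i ; Y_i \mid Y^{i-1}), \qquad I(X^1 \to Y^1) = I(X_1 ; Y_1).

For interactive programs with multiple rounds of exchange, the per-round conditioning on the output history is what preserves the direction of the information flow.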
We expect to deliver a broad range of contributions to both users and developers, enabling informed usage of critical software systems, detecting and mitigating threats to fairness and security, and enhancing the overall quality of software systems.

Planned Impact

The EPSRC Centre for Doctoral Training in Cybersecurity will train over 55 experts in multi-disciplinary aspects of cybersecurity, from engineering to crime science and public policy.

Short term impacts are associated with the research outputs of the 55+ research projects that will be undertaken as part of the doctoral studies of CDT students. Each project will tackle an important cybersecurity problem and propose and evaluate solutions, interventions and policy options. Students will publish these in international peer-reviewed journals and also disseminate them through blog posts and material geared towards decision makers and experts in adjacent fields. Through industry placements relating to their projects, all students will have the opportunity to implement and evaluate their ideas within real-world organizations, achieving short term impact in solving cybersecurity problems.

In the longer term, graduates of the CDT will assume leading positions within industry, government, law enforcement, the third sector and academia, increasing the UK's capacity to be a leader in cybersecurity. From those leadership positions they will assess options and formulate effective interventions to tackle cybercrime, secure the UK's infrastructure, establish norms of cooperation between industries and government to secure IT systems, and become leading researchers and scholars, further increasing the UK's capacity in cybersecurity in the years to come. This last impact is likely to be significant given that many higher education training programs currently do not have the capacity to provide cybersecurity training at undergraduate or graduate levels, particularly in non-technical fields.

The full details of our plan to achieve impact can be found in the "Pathways to Impact" document.

Publications


Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/S022503/1 | | | 01/04/2019 | 23/11/2028 |
2399261 | Studentship | EP/S022503/1 | 01/10/2020 | 30/09/2024 | Ilaria La Torre