Towards Explainable and Robust Statistical AI: A Symbolic Approach
Lead Research Organisation:
University of Edinburgh
Department Name: Sch of Informatics
Abstract
Data science provides many opportunities to improve private and public life, and it has enjoyed significant investment in the UK, EU and elsewhere. Discovering patterns and structures in large troves of data in an automated manner - that is, machine learning - is a core component of data science. Machine learning currently drives applications in computational biology, natural language processing and robotics. However, this highly positive impact is coupled with a significant challenge: when can we convincingly deploy these methods in the workplace? For example:
(a) how can we elicit intuitive and reasonable responses from these methods?
(b) would these responses be amenable to suggestions/preferences/constraints from non-expert users?
(c) do these methods come with worst-case guarantees?
Such questions are clearly vital for appreciating their benefits in human-machine collectives.
This project is broadly positioned in the context of establishing a general computational framework to aid explainable and robust machine learning. This framework unifies probabilistic graphical models, which form the statistical basis for many machine learning methods, and relational logic, the language of classes, objects and composition. The framework allows us to effectively codify complex domain knowledge for big uncertain data.
Concretely, the project aims to learn a model that best summarises the observed data in a completely automated fashion, thereby accounting for both observable and hidden factors in that data. To provide guarantees, two distinct algorithms are considered:
(a) an algorithm that learns simple models with exact computations;
(b) an algorithm that learns complex models but rests on approximations with certificates.
To evaluate the explainable, interactive nature of the learned models, the project considers the application of dialogue management with spatial primitives (e.g., "turn south after the supermarket"). We will study the scalability of these algorithms, and then evaluate the closeness of the learned models to actual suggestions from humans.
Computationally efficient and explainable algorithms will significantly expand the range of applications to which the probabilistic machine learning framework can be applied in society and contribute to the "democratisation of data."
Planned Impact
The work in this proposal has the potential for substantial economic benefit: better modelling languages and faster, more robust inference algorithms will enable the graphical modelling framework to be applied with less effort and more confidence to a broader variety of problems. Graphical models provide a principled reasoning framework that has proven successful for problems involving noisy data and unknown hidden variables. Naturally, this applies to many applications and domains, both in academia and industry. Indeed, graphical models have supercharged the use of statistical methods in robotics, vision and medical diagnosis, as well as core technologies thereof, such as deep generative neural networks. Despite this success, actually specifying these models is very challenging, especially for non-experts, and in that regard probabilistic relational models have helped enormously. Our project builds directly on these successes, and will allow these models to finally reason about continuous data, which are very common in real-world applications (e.g., measurement errors, or temporally indexed values such as stock price fluctuations).
The following will benefit from the direction of this research:
- The general public: If users do not understand the inner workings of machine learning models and also cannot extract meaningful behaviour from them, their applicability is likely to be limited to the select few who are technologically-gifted; this is likely to deepen social inequality. Explainable algorithms would help realise machine learning systems as an enabling technology, especially in human-machine collectives, to achieve goals collaboratively.
- Private sector: Smart services and intelligent programs are becoming increasingly ubiquitous, and are often situated in larger smart environments. By identifying the foundations for compositionality in machine learning, technology-oriented companies can capitalise on these techniques to engineer their systems. Moreover, robustness can help such companies understand the risks of deploying a probabilistic machine learning method in a social context, in the sense of avoiding catastrophic outcomes. Finally, regulations on explainability in algorithms will certainly affect the packaging of AI technologies in products.
- Research community: Researchers will gain insights into expressive modelling languages and robust machine learning techniques. Graphical models can be difficult for non-experts to apply, especially if they must design their own custom inference techniques. Our outcomes will contribute to alleviating these challenges.
To reach out to these beneficiaries on the project's outputs, we will:
(a) publish the results in the top venues for AI, and engage with industry labs;
(b) engage with non-experts (via discussion forums such as the Edinburgh International Development Society);
(c) make benchmarks and software open access and publicise them on, e.g., beyondnp.org, which is a software repository for solvers of problem domains beyond the NP complexity class;
(d) deliver seminars, lectures, and enhance teaching modules based on the outcomes of this project; and
(e) organise a research workshop on symbolic methods for explainable AI, to which we will invite industry thought leaders invested in statistical relational learning and human-machine interaction, such as IBM, Microsoft and Google.
Organisations
Publications
Bueff A
(2021)
Probabilistic Tractable Models in Mixed Discrete-Continuous Domains
in Data Intelligence
Fuxjaeger A.R.
(2020)
Scaling up probabilistic inference in linear and non-linear hybrid domains by leveraging knowledge compilation
in ICAART 2020 - Proceedings of the 12th International Conference on Agents and Artificial Intelligence
Fuxjaeger Anton
(2018)
Scaling up Probabilistic Inference in Linear and Non-Linear Hybrid Domains by Leveraging Knowledge Compilation
in arXiv e-prints
Levray Amelie
(2019)
Learning Credal Sum-Product Networks
in arXiv e-prints
Mendez Lucero MA
(2022)
Signal Perceptron: On the Identifiability of Boolean Function Spaces and Beyond.
in Frontiers in artificial intelligence
Papantonis I
(2021)
Closed-Form Results for Prior Constraints in Sum-Product Networks.
in Frontiers in artificial intelligence
Speichert S
(2018)
Learning Probabilistic Logic Programs in Continuous Domains
Description | EPSRC IAA PIII055 AI for Credit Risk |
Amount | £149,889 (GBP) |
Organisation | University of Edinburgh |
Sector | Academic/University |
Country | United Kingdom |
Start | 07/2019 |
End | 12/2019 |
Description | Huawei-Edinburgh Collaboration |
Amount | £125,000 (GBP) |
Organisation | Huawei Technologies |
Sector | Private |
Country | China |
Start | 09/2019 |
End | 09/2022 |
Description | Royal Society University Research Fellowship |
Amount | £800,000 (GBP) |
Organisation | The Royal Society |
Sector | Charity/Non Profit |
Country | United Kingdom |
Start | 10/2019 |
End | 09/2024 |
Title | LearnCSPN |
Description | It learns sum product networks (a major tractable probabilistic model) over missing data via a credal semantics. It is the outcome of the submission with Amelie Levray. |
Type Of Technology | Software |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | The software will be released as open source as soon as the corresponding paper is published; we will be able to assess impact after that |
Title | LearnWMI |
Description | It learns sum product networks (a major tractable graphical model) over hybrid discrete continuous data. It is the outcome of the submission with Andreas Bueff and Stefanie Speichert. |
Type Of Technology | Software |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | The software will be released as open source as soon as the corresponding paper is published; we will be able to assess impact after that |
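Not part of the project's released code: a minimal, self-contained sketch of the kind of model LearnWMI targets, namely a sum-product network over mixed discrete-continuous data. The structure, weights and leaf distributions below are hypothetical, chosen only to illustrate why evaluation is tractable (a single bottom-up pass, linear in network size).

```python
import math

def bernoulli(p, a):
    """Probability mass of a Bernoulli(p) leaf at outcome a in {0, 1}."""
    return p if a == 1 else 1.0 - p

def gaussian(x, mu, sigma):
    """Density of a Gaussian leaf N(mu, sigma^2) at x."""
    return math.exp(-0.5 * ((x - mu) / sigma) ** 2) / (sigma * math.sqrt(2 * math.pi))

def spn_density(a, x):
    """Joint density of a toy SPN over one Boolean variable A and one
    continuous variable X. Product nodes multiply children with disjoint
    variable scopes; the sum node takes a convex combination of its
    children (weights 0.6 and 0.4 sum to 1)."""
    left = bernoulli(0.8, a) * gaussian(x, 0.0, 1.0)
    right = bernoulli(0.2, a) * gaussian(x, 3.0, 1.0)
    return 0.6 * left + 0.4 * right

# Evaluation is one bottom-up pass over the network -- the property
# that makes sum-product networks "tractable" probabilistic models.
print(spn_density(1, 0.0))
```

Structure learners such as LearnWMI aim to recover both the network shape and the leaf parameters from data, rather than fixing them by hand as above.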
Title | WMI-SDD |
Description | It provides an inference engine for mixed discrete continuous data via weighted model integration by leveraging propositional knowledge compilation techniques. It is the outcome of the submission with Anton Fuxjaeger. |
Type Of Technology | Software |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | The software will be released as open source as soon as the corresponding paper is published; we will be able to assess impact after that |
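Again not part of the released engine: a toy, hand-worked instance of weighted model integration, the task WMI-SDD addresses. WMI generalises weighted model counting to hybrid domains: enumerate the models of the Boolean abstraction of a formula, then integrate the weight function over the continuous region each model induces. The formula, weights and bounds below are hypothetical.

```python
def integrate_x(lo, hi):
    """Closed-form integral of the weight w(x) = x over [lo, hi]."""
    return (hi ** 2 - lo ** 2) / 2.0

def wmi():
    """WMI of the toy formula phi = (0 < x < 2) and (b or x < 1),
    with continuous weight w(x) = x and literal weight 0.5 for both
    b and not-b."""
    total = 0.0
    for b in (False, True):
        # Each Boolean model fixes a continuous region: x in (0, 2)
        # when b holds, x in (0, 1) otherwise.
        lo, hi = 0.0, 2.0 if b else 1.0
        total += 0.5 * integrate_x(lo, hi)
    return total

print(wmi())  # 0.5 * 0.5 + 0.5 * 2.0 = 1.25
```

The engine itself avoids this naive enumeration by compiling the Boolean abstraction into a sentential decision diagram (SDD), so shared sub-formulas are integrated once.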
Title | WMIProbFoil |
Description | It learns the structure of ProbLog (a major and influential probabilistic logic programming approach) over mixed discrete continuous data. It is the outcome of the submission with Stefanie Speichert. |
Type Of Technology | Software |
Year Produced | 2018 |
Open Source License? | Yes |
Impact | The software will be released as open source as soon as the corresponding paper is published; we will be able to assess impact after that |
Description | Effective inference and learning with probabilistic logical models in continuous domains |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | I gave a tutorial on effective inference and learning with probabilistic logical models in continuous domains at ACAI 2018, the summer school on statistical relational AI.
Year(s) Of Engagement Activity | 2018 |
URL | http://acai2018.unife.it |
Description | Explainability as a service |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Professional Practitioners |
Results and Impact | I gave a talk on "Explainability as a Service" at The End of Privacy 1.0: Data Portability and Information Rights workshop in London.
Year(s) Of Engagement Activity | 2018 |
Description | Interpretability of Algorithmic Systems Workshop |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Policymakers/politicians |
Results and Impact | I gave a talk at the workshop on how the synthesis of logic and machine learning, especially areas such as statistical relational learning, can enable interpretability. |
Year(s) Of Engagement Activity | 2018 |
URL | https://interpretability.wordpress.com/main/ |
Description | Invited seminars at Ben-Gurion and Oxford |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | I gave a talk on decision-theoretic planning via probabilistic programming at Oxford and Ben-Gurion University. |
Year(s) Of Engagement Activity | 2018 |
URL | http://www.cs.ox.ac.uk/seminars/1932.html |
Description | Perspectives on Explainable AI |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Policymakers/politicians |
Results and Impact | I gave a talk entitled "Perspectives on Explainable AI," at an interdisciplinary workshop focusing on building trust in AI. Workshop title: Build Trust in AI - Designing for Consent |
Year(s) Of Engagement Activity | 2018 |
URL | https://twitter.com/nxtstop1/status/965249154854334464?ref_src=twcamp%5Eshare%7Ctwsrc%5Em5%7Ctwgr%5E... |
Description | Probabilistic Planning by Probabilistic Programming: Semantics, Inference and Learning |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | I gave a talk at the Cognitive Robotics Workshop at KR-18, in Tempe, Arizona, USA. |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.maskor.fh-aachen.de/events/CogRob2018/ |
Description | Towards Interpretable & Responsible AI
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | National |
Primary Audience | Industry/Business |
Results and Impact | I gave a talk at the London Machine Learning Meetup on the above topic. |
Year(s) Of Engagement Activity | 2018 |
URL | https://www.meetup.com/London-Machine-Learning-Meetup/events/255428240/ |
Description | Tutorial on unifying logic, probability and dynamics |
Form Of Engagement Activity | A talk or presentation |
Part Of Official Scheme? | No |
Geographic Reach | International |
Primary Audience | Postgraduate students |
Results and Impact | I gave the tutorial at the 16th International Conference on Principles of Knowledge Representation and Reasoning / KR 2018. |
Year(s) Of Engagement Activity | 2018 |
URL | http://reasoning.eas.asu.edu/kr2018/ |