Improving Learning via Reasoning

Lead Research Organisation: University of Oxford
Department Name: Computer Science

Abstract

This project falls within the EPSRC Artificial Intelligence research area.

In recent years, machine learning (ML), and in particular deep learning (DL), has achieved astonishing results in many fields, such as natural language processing (see, e.g., [7]) and computer vision (see, e.g., [6]). Nevertheless, these great success stories have been overshadowed by the discovery of adversarial examples, which showed how easily ML methods, and in particular neural networks (NNs), can be fooled [3].

This has made the need for effective methods to design, and later verify, correct and well-functioning NNs despite their complexity all the greater. Some early steps have already been taken in both directions. In particular:

1. from the design point of view, in 2012, Diligenti et al. proposed a framework to incorporate first-order logic clauses, which can be seen as an abstract and partial representation of the environment, into kernel machines [1]. Then, in 2017, the same research group presented semantic-based regularization [2]: a framework that bridges the ability of ML to learn from continuous datapoints with the ability to model and learn from the high-level semantic knowledge typical of statistical relational learning (a toy illustration of this idea is sketched below).

2. from the verification point of view, most existing attempts check NNs' properties by encoding them into constraint systems: e.g., Huang et al. [4] proposed a verification framework for feed-forward neural networks based on satisfiability modulo theories (SMT), while Katz et al. [5] extended an SMT solver to handle the ReLU activation function (a toy illustration of this style of encoding follows this list).
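
To make the verification direction concrete, the following is a minimal sketch, in Python with the Z3 SMT solver, of this style of encoding: a tiny ReLU network with hand-picked, purely illustrative weights is written as constraints, the negation of the desired property is added, and unsatisfiability means the property holds for every input in the given domain. It is not the actual framework of [4] or [5], only an illustration of the idea.

# Minimal sketch of SMT-based NN verification (illustrative weights and bounds,
# not the frameworks of [4] or [5]). Requires the z3-solver package.
from z3 import Real, If, Solver, sat

x = Real("x")                      # network input
h1, h2 = Real("h1"), Real("h2")    # hidden units (after ReLU)
y = Real("y")                      # network output

s = Solver()
s.add(x >= 0, x <= 1)                              # input domain
s.add(h1 == If(2 * x - 1 >= 0, 2 * x - 1, 0))      # h1 = ReLU(2x - 1)
s.add(h2 == If(-x + 1 >= 0, -x + 1, 0))            # h2 = ReLU(1 - x)
s.add(y == h1 - h2)                                # output layer
s.add(y > 1)                                       # negation of the property y <= 1

if s.check() == sat:
    print("Property violated, counterexample:", s.model())
else:
    print("Property y <= 1 holds for all x in [0, 1]")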

In this project, our goal is (i) to create a general framework in which it is possible to learn not only from examples, but also from background knowledge, constraints, and specifications (e.g., preconditions, postconditions, inputs, outputs), especially those formulated in logic, and/or (ii) to verify the learned system against the available logical information.
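
As for goal (i), the following minimal sketch, in the spirit of [1, 2] but not their actual implementation, shows one common way to learn from a logical rule: the rule "forall x. Cat(x) -> Animal(x)" is relaxed into a differentiable penalty and added to the usual supervised loss, so that gradient descent also pushes the network towards satisfying it. The class names, network shape, and the weight lam are illustrative choices.

# Minimal sketch of constraint-based regularization (illustrative only).
import torch
import torch.nn as nn

net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 2))  # logits for [Cat, Animal]
bce = nn.BCEWithLogitsLoss()
lam = 0.5  # strength of the constraint term (hypothetical value)

def rule_penalty(probs):
    # Fuzzy relaxation of Cat(x) -> Animal(x): penalise max(0, p_cat - p_animal).
    p_cat, p_animal = probs[:, 0], probs[:, 1]
    return torch.relu(p_cat - p_animal).mean()

x = torch.randn(8, 16)                        # a toy batch of 8 examples
labels = torch.randint(0, 2, (8, 2)).float()  # multi-label targets for [Cat, Animal]

logits = net(x)
loss = bce(logits, labels) + lam * rule_penalty(torch.sigmoid(logits))
loss.backward()  # gradients now also push predictions towards satisfying the rule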

The novelty of this project lies in the integration of deductive and inductive methods. Indeed, although some work has already been done, the field is still largely unexplored and has high potential. The overall objective of the project is to create more explainable, reliable, and robust NNs and, more generally, ML methods, so that they can also be applied to safety-critical systems.

Publications


Studentship Projects

Project Reference | Relationship | Related To | Start | End | Student Name
EP/N509711/1 | | | 01/10/2016 | 30/09/2021 |
2052861 | Studentship | EP/N509711/1 | 01/10/2018 | 30/09/2021 | Eleonora Giunchiglia
 
Description Deep Learning (DL) has achieved incredible results in recent years, creating the illusion that every problem can now be solved by training a neural network. This has led many researchers and businesses to apply DL models extensively in many domains, including safety-critical ones, where high-stakes decisions need to be made (see, e.g., [1, 2]). As pointed out by Rudin in [4], the careless application of these models can have, and has already had, disastrous consequences: there have been examples of people erroneously denied parole [6], of DL-based pollution models predicting that highly polluted air could be safely breathed [3], and more generally of poor use of limited resources in medicine, criminal justice, finance, and other domains [5]. The objective of my research is thus to develop new DL models that can be applied in every domain, even in safety-critical ones.
The models developed in my work are safe because they are guaranteed to be coherent with a given set of hard constraints, which can be written by any expert in the field.

In particular:
(i) We developed a new DL model that is guaranteed to output predictions that are coherent with a set of hierarchy constraints, while exploiting the background knowledge expressed by such constraints in order to make better predictions (a minimal illustration of this coherence guarantee is sketched after this list). This model achieves state-of-the-art results, and it is particularly useful in functional genomics. Indeed, the primary goal of functional genomics is to describe the functions and interactions of genes and their products, RNA and proteins. However, in recent years, the generation of proteomic data has increased substantially, and annotating all sequences is costly and time-consuming, often making it unfeasible. It is thus necessary to develop methods like ours that can automate this process and create annotations that are actually deployable in the field (i.e., that are coherent with the constraints). This work resulted in the publication of [7].
(ii) We extended the above model to handle more expressive constraints written as normal logic rules. Again, the model not only makes predictions that are always coherent with the given constraints, but also exploits the background knowledge expressed by the constraints to obtain better predictions. As above, our model outperforms the other state-of-the-art models whose predictions are guaranteed to be coherent with the constraints.
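
The following minimal sketch illustrates one simple way to obtain outputs that are coherent with a class hierarchy, in the spirit of the constraint module described in [7] (the full model in [7] also pairs this with a dedicated loss, which is omitted here). The tiny hierarchy and the network are illustrative: the score of an ancestor class is taken to be the maximum over its own score and those of its descendants, so p(parent) >= p(child) by construction.

# Minimal sketch of hierarchy-coherent outputs (illustrative classes and network).
import torch
import torch.nn as nn

# Classes: 0 = Animal, 1 = Cat, 2 = Dog; Cat and Dog are subclasses of Animal.
# descendants[c] lists c itself plus all of its descendants in the hierarchy.
descendants = {0: [0, 1, 2], 1: [1], 2: [2]}

net = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 3))

def coherent_probs(x):
    raw = torch.sigmoid(net(x))                       # unconstrained per-class scores
    cols = [raw[:, descendants[c]].max(dim=1).values for c in range(3)]
    return torch.stack(cols, dim=1)                   # now p(Animal) >= p(Cat), p(Dog)

x = torch.randn(4, 16)
p = coherent_probs(x)
assert torch.all(p[:, 0] >= p[:, 1]) and torch.all(p[:, 0] >= p[:, 2])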

[1] Ahmed Alaa and Mihaela van der Schaar. AutoPrognosis: Automated clinical prognostic modeling via Bayesian optimization with structured kernel learning. In Proc. of ICML, 2018.
[2] James B. Heaton, Nicholas G. Polson, and Jan Hendrik Witte. Deep learning for finance: deep portfolios. Applied Stochastic Models in Business and Industry, 33(1):3-12, 2017.
[3] Michael McGough. How bad is Sacramento's air, exactly? Google results appear at odds with reality, some say. Sacramento Bee, 2018. URL https://www.sacbee.com/news/california/fires/article216227775.html.
[4] Cynthia Rudin. Stop explaining black box machine learning models for high stakes decisions and use interpretable models instead. Nature Machine Intelligence, 1(5), 2019.
[5] Kush R. Varshney and Homa Alemzadeh. On the safety of machine learning: Cyber-physical systems, decision sciences, and data products. Big Data, 5(3), 2016.
[6] Rebecca Wexler. When a computer program keeps you in jail: How computers are harming criminal justice. New York Times, 2017. URL https://www.nytimes.com/2017/06/13/opinion/how-computers-are-harming-criminal-justice.html.
[7] Eleonora Giunchiglia and Thomas Lukasiewicz. Coherent hierarchical multi-label classification networks. In Proc. of NeurIPS, 2020.
Exploitation Route This work represents a first step towards the creation of models that are safe by construction with respect to a set of constraints. In terms of research, the next steps are to:
(i) define one (or more) standard for writing such constraints, and encourage research scientists to publish new datasets together with standardised constraints,
(ii) study how to impose even more expressive constraints, and
(iii) develop new models that are able to better exploit the background knowledge expressed in the given constraints.

At the same time, AI practitioners can benefit from the outcomes of this work and apply (whenever possible) DL models to safety-critical domains.
Sectors Financial Services, and Management Consultancy, Healthcare, Pharmaceuticals and Medical Biotechnology, Other

URL https://arxiv.org/pdf/2010.10151.pdf
 
Description We have helped to identify and articulate a new problem in the deep learning community, i.e., machine learning with requirements. This has led to the creation of a dataset for autonomous driving that is annotated with logical requirements and that can be used in the field to train safer models.
Sector Transport