Machine Learning in Automated Theorem Proving

Lead Research Organisation: University of Cambridge
Department Name: Computer Science and Technology

Abstract

Reasoning is an essential component of a general intelligence, but has proven difficult to fully automate. Interactive provers that assist a human expert are still more prevalent than a successful fully-automated proof search. However, in the last years fully-automated Machine Learning methods set state-of-the-art results in several reasoning challenges that used to require human supervision and handcrafted feature engineering, e.g. machine translation (Wu et al., 2016), visual question answering (Kahou et al., 2017), reasoning on algorithmic tasks (Zaremba et al., 2014), and excelling at abstract strategy games (Silver et al., 2016). Some of these tasks draw similarities to the main bottlenecks in the ATP. For instance, the choice of the winning strategy in a game as well as decisions required to prove a theorem can be similarly modelled as search problems. My goal will be to address these challenges by developing new approaches to fill the gaps in ATP using Machine Learning.


The challenges in ATP that traditionally require human supervision are posed, for instance, by selection of the most useful mathematical statements to prove a conjecture (premise selection), the choice of an optimal strategy for a proof search (heuristic selection), or semantically relevant translation of a mathematical content (e.g. expressed in LaTeX) to the formal language of a prover (auto-formalization). Modern Machine Learning methods have been recently applied to all of the problems mentioned, and yielded promising results, for instance (Alemi et al., 2017) and (Bridge et al., 2014). Based on the state-of-the-art research, I would like to explore the combination of employing Reinforcement Learning methods to guide a proof search, and powerful learning algorithms such as Deep Neural Networks to extract the most relevant features, and to restrict the search space for the decision process.


There is a potential to make a significant progress in the research on semantically accurate representations of the mathematical statements. I would like to contribute to the ATP landscape by developing new methods at the interface of Neural Networks and Natural Language Processing, including graph embeddings that are the most natural way of representing mathematical formulae. In particular, I would like to investigate the more expressive Higher-order logic, using an open-source dataset released for the evaluation of new Machine Learning-based theorem proving methods (Kaliszyk et al., 2017).

In conclusion, I strive to improve the automation of theorem proving by using the most promising advances in Machine Learning. Performing the proposed research thanks to the EPSRC studentship will allow me to contribute to the key aspect of artificial intelligence, as well as to the applied fields that employ ATP, notably to the safety-critical software and hardware design. Moreover, this research aims at showing steps towards a better understanding and development of generally applicable Machine Learning methods.

Publications

10 25 50
 
Description In the last 2 years of my funding, I developed a new dataset, a new multi-agent game and several new variants of neural networks.
Exploitation Route The outcomes can be used in research related to graph classification (for example, automatic prediction of molecular properties), emergence of human language, improving data efficiency of neural networks used in image classification. The most relevant field is machine learning (with two short papers at the intersection of machine learning and automated theorem proving).
Sectors Chemicals,Digital/Communication/Information Technologies (including Software),Education,Financial Services, and Management Consultancy,Healthcare,Pharmaceuticals and Medical Biotechnology,Other

URL https://scholar.google.com/citations?user=BPqK1l0AAAAJ&view_op=list_works&sortby=pubdate