Verifying Safety Properties of Multi-agent Machine Learning Systems
Lead Research Organisation: University of Oxford
Department Name: Computer Science
Abstract
As autonomous systems that learn become more powerful and more numerous, so too does the need grow for human-interpretable explanations of their actions and for guarantees on their performance. In order to make use of the rapidly developing state of the art in agent-based and machine learning (ML) technologies across a wide range of situations, including in safety-critical environments, it is paramount that we are able to verify safety properties of these systems efficiently and effectively. Note that by safety properties we mean here the broad class of specifications that may be used to enforce safe behaviour or restrict dangerous behaviour, rather than any one specific (say, control-theoretic or temporal-logic) definition.
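To give one illustrative instance of this broad class (the particular logic and threshold here are examples, not commitments of the project), a requirement such as "the system remains outside unsafe states with probability at least 0.99" could be written in a probabilistic temporal logic as

$$\mathbb{P}_{\geq 0.99}\big[\,\mathbf{G}\,\neg\,\mathit{unsafe}\,\big],$$

where $\mathbf{G}$ is the 'always' (globally) operator and $\mathit{unsafe}$ labels the states to be avoided.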
Traditionally, such tasks are performed within the field of formal verification using methods better suited to older, symbolic artificial intelligence (AI) than to sub-symbolic or statistical AI, which has recently seen rapid progress. There is thus a pressing need to identify the extent to which these two paradigms can be brought together in order to solve these problems. Answering this question positively could help develop new techniques and tools for verifying safety properties of large multi-agent systems that use ML to adapt to their environment and to other agents (both human and artificial). Answering it negatively, say by identifying critical theoretical bottlenecks such as the computational complexity of the approach, could in turn help direct future research towards different solution spaces and clarify the limits of our existing technologies.
Previous work has been done both on logically constraining or guiding reinforcement learning (RL) agents (including in model-free domains) and on verifying temporal logic properties of multi-agent systems described in the language of game theory. One intuitive and promising next step would be to combine these two directions in order to verify properties of multi-agent reinforcement learning systems, typically modelled as Markov games, something that has not yet been done. The special case of one artificial ML agent and one human (modelled as a boundedly rational agent), forming a two-player imperfect-information game, is also particularly interesting and may bear on concerns about future situations in which such artificial agents become, in some sense, more competent than their human counterparts. Within these settings there is a range of interesting and unexplored variations stemming from differences in preference representation, the logics used for specification, stochasticity, agent knowledge, environment dynamics, definitions of stability/equilibria, and more besides.
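To make the combination above concrete, the following is a minimal sketch, assuming a toy two-agent Markov game with deterministic dynamics and a collision-avoidance safety property; the grid world, the one-step shield, and all names are illustrative assumptions rather than the project's actual method.

```python
# Illustrative sketch only: a toy two-agent Markov game on a line of cells,
# with a safety property ("the agents never occupy the same cell") enforced
# by a one-step shield that filters the joint actions available to the agents.
import itertools
import random

GRID = 5                # positions 0..4 on a line
ACTIONS = (-1, 0, +1)   # each agent may move left, stay, or move right

def step(state, joint_action):
    """Deterministic transition: each agent moves and is clamped to the grid."""
    return tuple(
        max(0, min(GRID - 1, pos + act))
        for pos, act in zip(state, joint_action)
    )

def safe(state):
    """Safety property: the two agents never share a cell."""
    return state[0] != state[1]

def shield(state):
    """Return only those joint actions whose successor satisfies the property."""
    candidates = itertools.product(ACTIONS, repeat=2)
    return [a for a in candidates if safe(step(state, a))]

state = (0, 4)          # agents start at opposite ends of the line
for t in range(10):
    allowed = shield(state)
    assert allowed, "shield is empty: no safe joint action from this state"
    joint_action = random.choice(allowed)   # stand-in for learned policies
    state = step(state, joint_action)
    print(t, joint_action, state, "safe" if safe(state) else "UNSAFE")
```

In the settings targeted by the project, the dynamics would generally be stochastic, the policies learned rather than sampled uniformly, and the specification given in a temporal logic and verified with formal guarantees rather than enforced one step at a time.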
Importantly, the approach developed in this project should be largely implementation-agnostic; AI is developing at such a rapid pace that the underlying mechanisms for learning and planning should be somewhat abstracted away in order to provide more general theoretical results. That said, if successful, such results should be both tractable and scalable, and should lend themselves to real-world implementation. Though the project is primarily concerned with high-level logical specifications and control, we expect that the existing literature on methods integrating these with low-level sub-symbolic models, such as neuro-symbolic systems, will be relevant to some aspects of the work.
This project falls primarily within the 'ICT' and 'Global Uncertainties' EPSRC research themes and is relevant to the following research areas: 'Artificial intelligence technologies', 'Complexity science', 'Control engineering', 'Logic and combinatorics', 'Theoretical computer science', and 'Verification and correctness'.
Organisations
University of Oxford (Lead Research Organisation)
People | ORCID iD
---|---
Julian Gutierrez (Primary Supervisor) |
Lewis Hammond (Student) |
Studentship Projects
Project Reference | Relationship | Related To | Start | End | Student Name
---|---|---|---|---|---
EP/R513295/1 | | | 30/09/2018 | 29/09/2023 |
2218880 | Studentship | EP/R513295/1 | 30/09/2019 | 29/06/2024 | Lewis Hammond