Bayesian methods for learning and satisfaction of safety constraints

Lead Research Organisation: University of Oxford
Department Name: Engineering Science

Abstract

Brief description of the context of the research including potential impact
Given a goal in the real world, autonomous agents may come up with solutions that are undesirable to humans, ranging from the mildly inconvenient to the life-threatening. This could be addressed by a set of constraints on their behaviour. However, especially if such constraints are meant to represent complex human preferences, they will themselves need to be inferred from various sources (such as human behaviour or explicit feedback), and they will never be known with certainty. Probabilistic modelling can account for uncertainty both about the constraints themselves (which outcomes and behaviours should be avoided) and about the world dynamics (which outcomes a behaviour could lead to); however, the resulting Bayesian optimization problem runs into tractability issues. Various methods stemming from inverse reinforcement learning have recently been proposed for learning human preferences, including learning constraints. However, most proposed algorithms in this area deviate from the Bayesian framework in ways that do not preserve its desirable properties, such as calibration, which may be essential especially for safety-critical constraints.
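To make the calibration point concrete, the sketch below is a toy illustration (not an algorithm from the project) of Bayesian inference over an unknown constraint: a threshold on a scalar state feature is inferred from demonstrations of a constraint-respecting agent, and the posterior stays honest about which thresholds the data cannot rule out. The uniform behaviour model and all numerical values are hypothetical assumptions made only for the example.

```python
# A toy sketch of calibrated Bayesian inference over an unknown constraint.
# Assumption: the constraint is a threshold `theta` on a scalar state feature,
# with feature values above theta forbidden; a demonstrator who respects the
# constraint picks feature values uniformly from the allowed range [0, theta].
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical demonstrations from a constraint-respecting agent
# (true, unknown threshold: 0.7).
demos = rng.uniform(0.0, 0.7, size=20)

# Discretised uniform prior over the threshold on [0, 1].
theta_grid = np.linspace(0.0, 1.0, 201)
posterior = np.full_like(theta_grid, 1.0 / len(theta_grid))

# Likelihood of one observed feature value x under each candidate threshold:
# zero if x exceeds the threshold, otherwise the uniform density 1/theta.
def likelihood(x, theta):
    return np.where(x <= theta, 1.0 / np.maximum(theta, 1e-9), 0.0)

for x in demos:
    posterior *= likelihood(x, theta_grid)
posterior /= posterior.sum()

# The posterior remains honest about what the data cannot settle: thresholds
# just above the largest observed feature value keep non-trivial probability.
mean = float((theta_grid * posterior).sum())
lo, hi = theta_grid[np.searchsorted(posterior.cumsum(), [0.05, 0.95])]
print(f"posterior mean threshold {mean:.2f}, 90% credible interval [{lo:.2f}, {hi:.2f}]")
```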
Aims and Objectives
This DPhil project aims to develop alternatives to these preference and constraint learning algorithms using principled Bayesian approaches, drawing on methods from areas such as Gaussian processes, Bayesian optimization, and Bayesian inverse reinforcement learning. It will also examine methods for probabilistic planning and probabilistic verification to ensure that, once a set of preferences and constraints is inferred, their satisfaction can be guaranteed with a sufficient degree of confidence.
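As a sketch of what such a confidence requirement could look like in practice, the snippet below estimates by Monte Carlo the probability that a candidate plan violates an uncertain constraint, and accepts the plan only if that probability is small enough. The one-dimensional random-walk dynamics, noise level and 95% confidence target are illustrative assumptions, not the project's verification method; in the project itself such sampling checks would likely be replaced by more principled probabilistic verification, but the sketch shows the kind of guarantee being targeted.

```python
# A toy sketch of checking a candidate plan against an uncertain constraint
# with a required confidence level. Posterior samples of the constraint
# threshold stand in for the inferred constraint; dynamics are a hypothetical
# noisy 1-D accumulation of the planned actions.
import numpy as np

rng = np.random.default_rng(1)

def violation_probability(plan, theta_samples, n_rollouts=2000, noise_std=0.05):
    """Monte Carlo estimate of the probability that executing `plan`
    ever drives the state above the (uncertain) constraint threshold."""
    horizon = len(plan)
    # One constraint hypothesis and one dynamics realisation per rollout.
    thetas = rng.choice(theta_samples, size=n_rollouts)
    noise = rng.normal(0.0, noise_std, size=(n_rollouts, horizon))
    states = np.cumsum(np.asarray(plan))[None, :] + np.cumsum(noise, axis=1)
    return float((states.max(axis=1) > thetas).mean())

# Hypothetical posterior samples of the threshold (e.g. from inference as
# sketched in the abstract above) and two candidate plans.
theta_samples = rng.normal(0.72, 0.03, size=500)
plans = {"cautious": [0.08] * 5, "aggressive": [0.15] * 5}

for name, plan in plans.items():
    p = violation_probability(plan, theta_samples)
    verdict = "accept" if p <= 0.05 else "reject"
    print(f"{name}: estimated violation probability {p:.3f} -> {verdict} at 95% confidence")
```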
Novelty of the research methodology
The literature on inferring constraints from observations of an agent's behaviour is limited, and within it there is almost no work using Bayesian methods. This project will partly aim to fill this gap. Similarly, the work will aim to fill gaps in principled approaches for providing probabilistic guarantees on constraint satisfaction in forward behaviour planning.
Alignment to EPSRC's strategies and research areas
The project fits naturally under EPSRC's research areas of artificial intelligence technologies and verification and correctness, with possible overlaps with control engineering, whose techniques it might use to provide safety guarantees for autonomous AI systems. Algorithms developed as part of the project would also have natural applications in the research area of robotics.
Companies or collaborators involved
No external collaboration is currently arranged.

Planned Impact

AIMS's impact will be felt across domains of acute need within the UK. We expect AIMS to benefit: UK economic performance, through start-up creation; existing UK firms, both through research and addressing skills needs; UK health, by contributing to cancer research, and quality of life, through the delivery of autonomous vehicles; UK public understanding of and policy related to the transformational societal change engendered by autonomous systems.

Autonomous systems are acknowledged by essentially all stakeholders as important to the future UK economy. PwC claim that there is a £232 billion opportunity offered by AI to the UK economy by 2030 (10% of GDP). AIMS has an excellent track record of leadership in spinout creation, and will continue to foster the commercial projects of its students, through the provision of training in IP, licensing and entrepreneurship. With the help of Oxford Science Innovation (investment fund) and Oxford University Innovation (technology transfer office), student projects will be evaluated for commercial potential.

AIMS will also concretely contribute to UK economic competitiveness by meeting the UK's needs for experts in autonomous systems. To meet this need, AIMS will train cohorts with advanced skills that span the breadth of AI, machine learning, robotics, verification and sensor systems. The relevance of the training to the needs of industry will be ensured by the industrial partnerships at the heart of AIMS. These partnerships will also ensure that AIMS will produce research that directly targets UK industrial needs. Our partners span a wide range of UK sectors, including energy, transport, infrastructure, factory automation, finance, health, space and other extreme environments.

The autonomous systems that AIMS will enable also offer the prospect of epochal change in the UK's quality of life and health. As put by former Digital Secretary Matt Hancock, "whether it's improving travel, making banking easier or helping people live longer, AI is already revolutionising our economy and our society." AIMS will help to realise this potential through its delivery of trained experts and targeted research. In particular, two of the four Grand Challenge missions in the UK Industrial Strategy highlight the positive societal impact underpinned by autonomous systems. The "Artificial Intelligence and data" challenge has as its mission to "Use data, Artificial Intelligence and innovation to transform the prevention, early diagnosis and treatment of chronic diseases by 2030". To this mission, AIMS will contribute the outputs of its research pillar on cancer research. The "Future of mobility" challenge highlights the importance that autonomous vehicles will have in making transport "safer, cleaner and better connected." To this challenge, AIMS offers the world-leading research of its robotic systems research pillar.

AIMS will further promote the positive realisation of autonomous technologies through direct influence on policy. The world-leading academics amongst AIMS's supervisory pool are well-connected to policy formation, e.g. Prof Osborne serving as a Commissioner on the Independent Commission on the Future of Work. Further, Dr Dan Mawson, Head of the Economy Unit, Economy and Strategic Analysis Team at BEIS, will serve as an advisor to AIMS, ensuring bidirectional influence between policy objectives and AIMS research and training.

Broad understanding of autonomous systems is crucial in making a society robust to the transformations they will engender. AIMS will foster such understanding through its provision of opportunities for AIMS students to directly engage with the public. Given the broad societal importance of getting autonomous systems right, AIMS will deliver core training on the ethical, governance, economic and societal implications of autonomous systems.

Publications


Studentship Projects

Project Reference: EP/S024050/1 | Start: 01/10/2019 | End: 31/03/2028
Project Reference: 2416389 | Relationship: Studentship | Related To: EP/S024050/1 | Start: 01/10/2020 | End: 30/09/2024 | Student Name: Ondrej Bajgar