Safety, robustness, and economic properties of machine learning

Lead Research Organisation: University of Oxford

Department Name: Computer Science

Abstract

This project falls within the EPSRC Information and communication technologies (ICT) research area. It also interfaces with the Digital Economy and Engineering areas.

Under the supervision of Prof Yarin Gal and Prof Allan Dafoe, I will empirically and theoretically study safety and robustness of machine learning methods as well as the economic properties of the technology.

As machine learning algorithms become more capable, they will be deployed in a growing number of safety-critical environments. For example, a stock-trading algorithm has to conform to certain rules that are not easy to hard-code as a training signal. A reinforcement learning algorithm could therefore find ways to execute winning but illegal strategies that circumvent any ad-hoc objective meant to discourage such behavior.

At a high level, we are interested in fundamental research on robust (deep) learning methods that can be used to act on behalf of humans without costly supervision. Within well-controlled domains such as Atari, current ML techniques already scale far, which suggests that they could also scale in more realistic domains. We are particularly interested in robustness to changing distributions, a poorly understood problem, and in alignment with human preferences, a critical step towards safe and useful AI systems. A non-robust system, deployed on a novel distribution, may badly misunderstand its situation and thus confidently make harmful decisions. An imperfectly aligned system may be exploited with adversarial inputs or may optimize away the correlation between its stated and intended objectives.

My project will involve the application of Bayesian neural networks to such safety problems in machine learning. Prof Gal has significantly advanced the development of these networks. They can help to detect and adapt to distribution shift (robustness) or to actively solicit data about human preferences (alignment).
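The idea that a Bayesian model can flag distribution shift through its predictive uncertainty can be illustrated with a much simpler stand-in than a Bayesian neural network. The sketch below uses Bayesian linear regression with a Gaussian prior and a known noise variance (both assumptions made for this illustration, not part of the project description), and compares the predictive standard deviation at an input inside the training range with one far outside it. All function names, the prior precision `alpha`, and the synthetic data are hypothetical.

```python
import numpy as np

def features(x):
    """Map 1-D inputs to [1, x] so the model has an intercept and a slope."""
    return np.column_stack([np.ones_like(x), x])

def posterior(X, y, alpha=1.0, noise_var=0.01):
    """Gaussian posterior over weights for prior w ~ N(0, I / alpha)."""
    d = X.shape[1]
    precision = alpha * np.eye(d) + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov

def predictive_std(X_new, cov, noise_var=0.01):
    """Predictive standard deviation: observation noise plus weight uncertainty."""
    var = noise_var + np.einsum("ij,jk,ik->i", X_new, cov, X_new)
    return np.sqrt(var)

# Synthetic training data drawn from inputs in [-1, 1] (illustrative only).
rng = np.random.default_rng(0)
x_train = rng.uniform(-1.0, 1.0, 50)
y_train = 0.5 * x_train + rng.normal(0.0, 0.1, 50)

w_mean, w_cov = posterior(features(x_train), y_train)

# Compare uncertainty at an in-distribution input and a far-away input.
std_in = predictive_std(features(np.array([0.0])), w_cov)[0]
std_out = predictive_std(features(np.array([10.0])), w_cov)[0]
print(f"predictive std at x=0: {std_in:.3f}, at x=10: {std_out:.3f}")
```

The predictive standard deviation is larger for the input far from the training data, which is the kind of signal a downstream system could use to defer to a human or request more data. A Bayesian neural network would replace the closed-form posterior with an approximate one.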

Furthermore, we will research economic properties of machine learning as a technology. In economics, a technology is defined by its production function, which relates valuable inputs to outputs. In machine learning, those inputs include data, compute, and labor. The outputs depend on the specific task. They might be measures such as test error, but also the value of the resulting product that uses a learned model (e.g. an application or a stock-trading algorithm). Estimating such a production function is one of the oldest empirical problems in economics, but it has not been done explicitly for machine learning. The methodology involves microeconomic modeling, which machine learning researchers have so far not used.
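As a rough illustration of what estimating a production function could look like, the sketch below assumes a Cobb-Douglas form, output = A * data^a * compute^b * labor^c, which becomes linear after taking logarithms and can be fitted by ordinary least squares. The functional form, the function name `fit_cobb_douglas`, and the synthetic data are assumptions chosen for this example; the project description does not specify a particular model.

```python
import numpy as np

def fit_cobb_douglas(data, compute, labor, output):
    """OLS fit of log(output) = log A + a*log(data) + b*log(compute) + c*log(labor)."""
    X = np.column_stack([
        np.ones(len(output)),
        np.log(data),
        np.log(compute),
        np.log(labor),
    ])
    coef, *_ = np.linalg.lstsq(X, np.log(output), rcond=None)
    return {"log_A": coef[0], "data": coef[1], "compute": coef[2], "labor": coef[3]}

# Synthetic inputs (illustrative only): one row per hypothetical ML project.
rng = np.random.default_rng(0)
n = 500
data = rng.lognormal(mean=10.0, sigma=1.0, size=n)
compute = rng.lognormal(mean=8.0, sigma=1.0, size=n)
labor = rng.lognormal(mean=3.0, sigma=0.5, size=n)

# Generate output from known elasticities plus multiplicative noise.
noise = rng.normal(0.0, 0.05, n)
output = 2.0 * data**0.3 * compute**0.5 * labor**0.2 * np.exp(noise)

est = fit_cobb_douglas(data, compute, labor, output)
print({k: round(float(v), 3) for k, v in est.items()})
```

The fitted exponents are output elasticities: each one approximates the percentage change in output for a one percent change in the corresponding input. With real data, the inputs are chosen by the producer rather than drawn at random, so plain least squares can be biased and more careful econometric methods would be needed.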

Student:

Soren Mindermann

Period of Study:

Sep 19 - Mar 23

Funder:

EPSRC

Project Status:

Closed

Project Category:

Studentship

Project Reference:

2219023

Research Topic:

Unclassified

Organisations

University of Oxford (Lead Research Organisation)

People	ORCID iD
Soren Mindermann (Student)

Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/R513295/1			30/09/2018	29/09/2023
2219023	Studentship	EP/R513295/1	30/09/2019	30/03/2023	Soren Mindermann
