Devising robust Multi-Armed Bandit algorithms in the presence of non-stationarities and long-range dependencies

Lead Research Organisation: Lancaster University

Department Name: Mathematics and Statistics

Abstract

The Multi-Armed Bandit (MAB) problem is one of the central instances of sequential decision making under uncertainty and plays a key role in online learning and optimization. MABs arise in a variety of modern real-world applications, such as online advertising, Internet routing, and sequential portfolio selection, to name only a few. In this problem, a forecaster aims to maximize the expected sum of the rewards actively collected from unknown processes. MABs are typically studied under the assumption that the rewards are i.i.d.; however, this assumption does not necessarily hold in many practical situations. The objective of this project is to analyze the possibilities and limitations of more challenging, yet more realistic, (restless) MAB settings, where the reward distributions may exhibit long-range dependencies and may be non-stationary. As part of the project, novel MAB strategies with good performance guarantees will be sought, and applications to real-world problems will be explored.
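To make the standard i.i.d. setting mentioned in the abstract concrete, the following is a minimal sketch of UCB1, a well-known textbook strategy, run on simulated Bernoulli arms. It is an illustration only and not the project's method: the function name `ucb1`, its parameters, and the Bernoulli reward model are assumptions introduced here. The restless, non-stationary settings targeted by the project are precisely those where such an i.i.d. baseline may no longer be adequate.

```python
import math
import random


def ucb1(arm_means, horizon, seed=0):
    """Run UCB1 on simulated Bernoulli arms (hypothetical helper).

    arm_means are the hidden success probabilities; the forecaster only
    observes the rewards of the arms it plays. Returns the total reward
    collected and the number of pulls per arm.
    """
    rng = random.Random(seed)
    n_arms = len(arm_means)
    pulls = [0] * n_arms      # number of times each arm was played
    totals = [0.0] * n_arms   # summed reward observed per arm
    total_reward = 0.0

    for t in range(1, horizon + 1):
        if t <= n_arms:
            # Play every arm once so each empirical mean is defined.
            arm = t - 1
        else:
            # Empirical mean plus an exploration bonus that shrinks
            # as an arm is played more often.
            arm = max(
                range(n_arms),
                key=lambda a: totals[a] / pulls[a]
                + math.sqrt(2.0 * math.log(t) / pulls[a]),
            )
        # Assumption: rewards are i.i.d. Bernoulli draws for each arm.
        reward = 1.0 if rng.random() < arm_means[arm] else 0.0
        pulls[arm] += 1
        totals[arm] += reward
        total_reward += reward

    return total_reward, pulls


if __name__ == "__main__":
    reward, pulls = ucb1([0.2, 0.8], horizon=2000, seed=1)
    print("total reward:", reward)
    print("pulls per arm:", pulls)
```

In this sketch the forecaster concentrates its pulls on the arm with the higher hidden mean while still occasionally sampling the other one, which is the exploration-exploitation trade-off at the heart of the problem.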

Student:

Ali Arabzadeh

Period of Study:

Oct 2020 - Sep 2024

Funder:

EPSRC

Project Status:

Active

Project Category:

Studentship

Project Reference:

2437073

Research Topic:

Unclassified

Organisations

Lancaster University (Lead Research Organisation)

People	ORCID iD
Ali Arabzadeh (Student)


Studentship Projects

Project Reference	Relationship	Related To	Start	End	Student Name
EP/V520214/1			01/10/2020	31/10/2025
2437073	Studentship	EP/V520214/1	01/10/2020	30/09/2024	Ali Arabzadeh
