Reinforcement Learning for Finite Horizons (ReLeaF)

Lead Research Organisation: University of Liverpool
Department Name: Computer Science

Abstract

Reinforcement learning (RL) is a technique for learning how to take actions in an initially unknown environment in order to optimise an expected outcome, modelled through the notion of maximising a cumulative reward. Learning algorithms whose goals are written as temporal specifications have three key ingredients: the translation from the specification to a suitable finite automaton; the translation of this automaton to a reward structure, such that a strategy that provides optimal rewards is guaranteed to provide optimal control; and an embedding into a discounting scheme that, for appropriate parameters, ensures that a learner converges to an optimal strategy.
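To make the first two ingredients concrete, the following is a minimal sketch in Python, assuming a hand-coded automaton for the simple finite-horizon formula "F goal" (eventually reach goal); the class and function names are illustrative only, not the project's actual machinery.

```python
# A hand-coded DFA for the LTLf formula "F goal" (eventually reach `goal`).
# State 0: the goal has not yet been seen; state 1: satisfied (absorbing).

class DFA:
    """Deterministic finite automaton over atomic-proposition labels."""
    def __init__(self, initial, accepting, transitions):
        self.initial = initial          # initial automaton state
        self.accepting = accepting      # set of accepting states
        self.transitions = transitions  # dict: (state, label) -> state

    def step(self, q, label):
        # Missing entries are treated as self-loops, keeping the sketch compact.
        return self.transitions.get((q, label), q)

dfa = DFA(initial=0, accepting={1},
          transitions={(0, "goal"): 1, (1, "goal"): 1})

def reward(q, q_next):
    """Reward the first visit to an accepting state, so that a strategy
    maximising expected reward also maximises the probability of
    satisfying the finite-horizon specification."""
    return 1.0 if q_next in dfa.accepting and q not in dfa.accepting else 0.0
```

Under a reward structure of this shape, maximising expected reward coincides with maximising the probability of reaching an accepting automaton state, which is the sense in which optimal rewards guarantee optimal control.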

We will consider the RL problem for a popular specification language used in automation and motion planning: linear temporal logic over finite traces (LTLf). In particular, we will study model-free RL algorithms, which are better suited than their model-based counterparts to real-world applications where the behaviour of the environment is hard to predict. We will propose learning algorithms that provide translations from LTLf to reward structures with formal guarantees of satisfying the given goals for environments modelled as Markov Decision Processes (MDPs).
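As an illustration of the model-free setting, the sketch below runs tabular Q-learning on the product of the environment with the specification automaton from the previous sketch. The environment interface (reset, step, label, actions) is an assumption made for this example, not a fixed API.

```python
import random
from collections import defaultdict

# A model-free tabular Q-learner on the product of the environment with the
# specification automaton. The `dfa` and `reward` objects from the previous
# sketch are assumed.

def q_learning(env, dfa, reward, episodes=5000, alpha=0.1,
               gamma=0.99, epsilon=0.1):
    Q = defaultdict(float)                     # Q[((s, q), a)] -> value
    for _ in range(episodes):
        s, q = env.reset(), dfa.initial        # product state (s, q)
        done = False
        while not done:
            # Epsilon-greedy action selection over the product state.
            if random.random() < epsilon:
                a = random.choice(env.actions)
            else:
                a = max(env.actions, key=lambda act: Q[((s, q), act)])
            s2, done = env.step(a)
            q2 = dfa.step(q, env.label(s2))    # advance the automaton
            r = reward(q, q2)                  # reward comes from the automaton
            best_next = 0.0 if done else max(
                Q[((s2, q2), act)] for act in env.actions)
            Q[((s, q), a)] += alpha * (r + gamma * best_next - Q[((s, q), a)])
            s, q = s2, q2
    return Q
```

Note that the learner never builds a model of the environment's transition probabilities; it only updates value estimates from sampled transitions, which is what makes the approach model-free.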

We will extend our techniques to infinite-state MDPs, starting with classes for which formal guarantees can be provided, such as countable, finitely branching MDPs, and we will study conditions under which our techniques extend to more general classes, for instance smoothness guarantees for compact MDPs.

We will complement these lines of research by considering goals with constraints. This effectively amounts to prioritised goals, where meeting safety constraints takes precedence, while other properties, such as efficiency, serve as tie-breakers among strategies that provide the same safety guarantees.
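A minimal sketch of this prioritised selection, assuming placeholder functions for evaluating the safety and efficiency values of a strategy:

```python
# Prioritised strategy selection: safety dominates, efficiency breaks ties.
# The evaluation functions are placeholders for this example, not the
# project's actual machinery.

def best_strategy(strategies, safety_value, efficiency_value):
    """Pick the strategy with the best safety value; among strategies
    with the same safety guarantee, prefer the more efficient one."""
    return max(strategies,
               key=lambda pi: (safety_value(pi), efficiency_value(pi)))

# Example: two strategies with equal safety, ranked by efficiency.
values = {"pi1": (1.0, 0.4), "pi2": (1.0, 0.9)}
print(best_strategy(["pi1", "pi2"],
                    safety_value=lambda pi: values[pi][0],
                    efficiency_value=lambda pi: values[pi][1]))  # -> pi2
```

The lexicographic comparison captures the intended priority: no gain in efficiency can compensate for a weaker safety guarantee.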