Cross-Layer Uncertainty-Aware Reinforcement Learning for Safe Autonomous Driving

Lead Research Organisation: University of Southampton
Department Name: School of Electronics and Computer Science

Abstract

Autonomous driving (AD) has a huge market and is receiving enormous attention in both academia and industry. To deal with complex scenarios, autonomous vehicles (AVs) use reinforcement learning (RL) to design high-level planners in the functional layer, but these planners often suffer from safety issues during sim-to-real transfer. One of the main challenges is that current functional-layer design practice does not sufficiently consider uncertainty in the architecture layer, e.g., the software layer and hardware layer. This project will tackle this open challenge through a comprehensive study of the interaction between RL and architecture-layer uncertainty. Specifically, we will build virtual AD scenarios on a simulation platform, with formal modeling of architecture-layer uncertainty based on real-world data (WP1). The impact of these uncertainties on RL will be studied through the design of cross-layer uncertainty-aware RL (WP2). Conversely, we will study the robustness of an RL policy to cross-layer uncertainty by computing the Pareto front of the largest software/hardware uncertainty patterns that a given policy is robust to (WP3). Extensive analysis, including verification (WP2, WP3), simulation (WP2, WP3), and real-world experiments (WP4), will be carried out. The success of this project will greatly improve the practicability of RL in AD, with a broader impact on other robotics applications.
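The Pareto-front idea in WP3 can be illustrated with a minimal sketch. This is not the project's method: the uncertainty pattern here is assumed to be a pair (delay steps, noise level), and `toy_is_robust` is a hypothetical stand-in for whatever verification or simulation check decides whether a policy stays safe under a given pattern.

```python
from typing import Callable, Iterable, List, Tuple

# Assumed encoding: one non-negative number per uncertainty source,
# e.g. (delay_steps, noise_level). Larger means more uncertainty.
Pattern = Tuple[int, ...]


def dominates(a: Pattern, b: Pattern) -> bool:
    """True if a is at least as large as b everywhere and strictly larger somewhere."""
    return all(x >= y for x, y in zip(a, b)) and any(x > y for x, y in zip(a, b))


def pareto_front(
    candidates: Iterable[Pattern],
    is_robust: Callable[[Pattern], bool],
) -> List[Pattern]:
    """Return the robust patterns that no other robust pattern dominates."""
    robust = [p for p in candidates if is_robust(p)]
    return sorted(p for p in robust if not any(dominates(q, p) for q in robust))


def toy_is_robust(pattern: Pattern) -> bool:
    """Hypothetical oracle: the policy tolerates a combined budget of 4."""
    delay_steps, noise_level = pattern
    return delay_steps + noise_level <= 4


# Enumerate a small grid of candidate patterns and keep the maximal robust ones.
grid = [(d, n) for d in range(5) for n in range(5)]
front = pareto_front(grid, toy_is_robust)
print(front)
```

In this toy setting the front is the set of grid points that exhaust the assumed budget. A real robustness check would replace `toy_is_robust`, and exhaustive enumeration would likely give way to a smarter search.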
 
Title Time-delay reinforcement learning with auxiliary short delays 
Description This is an open-source tool that enables efficient time-delay reinforcement learning. 
Type Of Material Computer model/algorithm 
Year Produced 2024 
Provided To Others? Yes  
Impact This algorithm led to a conference paper. 
URL https://github.com/QingyuanWuNothing/AD-RL
 
Title Vision-based safe reinforcement learning tool 
Description This is the algorithm we propose for safe reinforcement learning with vision inputs. 
Type Of Material Computer model/algorithm 
Year Produced 2024 
Provided To Others? Yes  
Impact This algorithm led to a conference publication. 
URL https://github.com/SimonZhan-code/Step-Wise_SafeRL_Pixel
 
Description Main collaborators 
Organisation Nanyang Technological University
Country Singapore 
Sector Academic/University 
PI Contribution We lead the project, formulating the research problems, proposing new algorithms, and conducting experiments.
Collaborator Contribution Our partners meet with us every other week to discuss the work and provide suggestions. They also help with the experiments.
Impact The collaboration has produced 4 publications and 2 open-source tools.
Publication 1: Polar-express: Efficient and precise formal reachability analysis of neural-network controlled systems
Publication 2: Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Publication 3: State-wise safe reinforcement learning with pixel observations
Publication 4: Case Study: Runtime Safety Verification of Neural Network Controlled System
Tool 1: Vision-based safe reinforcement learning
Tool 2: Time-delay reinforcement learning with auxiliary short delays
Start Year 2024
 
Description Main collaborators 
Organisation National Taiwan University
Country Taiwan, Province of China 
Sector Academic/University 
PI Contribution We lead the project, formulating the research problems, proposing new algorithms, and conducting experiments.
Collaborator Contribution Our partners meet with us every other week to discuss the work and provide suggestions. They also help with the experiments.
Impact The collaboration has produced 4 publications and 2 open-source tools.
Publication 1: Polar-express: Efficient and precise formal reachability analysis of neural-network controlled systems
Publication 2: Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Publication 3: State-wise safe reinforcement learning with pixel observations
Publication 4: Case Study: Runtime Safety Verification of Neural Network Controlled System
Tool 1: Vision-based safe reinforcement learning
Tool 2: Time-delay reinforcement learning with auxiliary short delays
Start Year 2024
 
Description Main collaborators 
Organisation Northwestern University
Country United States 
Sector Academic/University 
PI Contribution We lead the project, formulating the research problems, proposing new algorithms, and conducting experiments.
Collaborator Contribution Our partners meet with us every other week to discuss the work and provide suggestions. They also help with the experiments.
Impact The collaboration has produced 4 publications and 2 open-source tools.
Publication 1: Polar-express: Efficient and precise formal reachability analysis of neural-network controlled systems
Publication 2: Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays
Publication 3: State-wise safe reinforcement learning with pixel observations
Publication 4: Case Study: Runtime Safety Verification of Neural Network Controlled System
Tool 1: Vision-based safe reinforcement learning
Tool 2: Time-delay reinforcement learning with auxiliary short delays
Start Year 2024
 
Description New Collaboration - Florida 
Organisation University of Florida
Country United States 
Sector Academic/University 
PI Contribution I provided support for using my tool POLAR in the project, technical advice, and revision of the paper draft.
Collaborator Contribution They led the project.
Impact The output is one publication. Paper 1: Bridging Dimensions: Confident Reachability for High-Dimensional Controllers
Start Year 2024
 
Description Visit Verimag, France 
Form Of Engagement Activity A talk or presentation
Part Of Official Scheme? No
Geographic Reach Local
Primary Audience Schools
Results and Impact Prof. Thao Dang invited me to Verimag, France, to give a talk about my recent research and discuss potential collaborations. Around 20 researchers attended my talk.
Year(s) Of Engagement Activity 2024