Cross-Layer Uncertainty-Aware Reinforcement Learning for Safe Autonomous Driving
Lead Research Organisation:
University of Southampton
Department Name: Sch of Electronics and Computer Sci
Abstract
Autonomous driving (AD) has a huge market and is receiving enormous attention in both academia and industry. To deal with complex scenarios, autonomous vehicles (AVs) will use reinforcement learning (RL) to design high-level planners in the functional layer, but such planners suffer from safety issues during sim-to-real transfer. One of the main challenges is that the current practice of functional-layer design does not sufficiently consider the uncertainty in the architecture layer, e.g., the software layer and hardware layer. This project will tackle this open challenge through a comprehensive study of the interaction between RL and architecture-layer uncertainty. Specifically, we will build virtual AD scenarios on the simulation platform with formal modeling of architecture-layer uncertainty based on real-world data (WP1). The impact of uncertainties on RL will be studied through the design of cross-layer uncertainty-aware RL (WP2). Conversely, we will also study the robustness of an RL policy with respect to cross-layer uncertainty by computing the Pareto front of the largest software/hardware uncertainty patterns that a given RL policy is robust to (WP3). Extensive analysis including verification (WP2, WP3), simulation (WP2, WP3), and real-world experiments (WP4) will be carried out. The success of this project will greatly improve the practicability of RL in AD, with a broader impact on other robotics applications.
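As a rough illustration of the Pareto-front idea in WP3, the sketch below assumes that each uncertainty pattern can be summarised by two magnitudes, one for the software layer and one for the hardware layer, and that a set of pairs the policy tolerates is already available (for example from verification or simulation). The function name `pareto_front`, the two-number encoding, and the sample values are hypothetical and are not taken from the project; the code only shows how the non-dominated (largest) tolerated pairs could be filtered out.

```python
from typing import List, Tuple

# Hypothetical encoding: (software-layer uncertainty, hardware-layer uncertainty).
Point = Tuple[float, float]


def pareto_front(points: List[Point]) -> List[Point]:
    """Return the points that no other point dominates.

    A point q dominates p when q is at least as large on both axes
    and differs from p. Larger means more uncertainty tolerated.
    """
    front = []
    for p in points:
        dominated = any(
            q != p and q[0] >= p[0] and q[1] >= p[1]
            for q in points
        )
        if not dominated:
            front.append(p)
    # Remove duplicates and return in a stable, sorted order.
    return sorted(set(front))


# Made-up pairs that a policy is assumed to tolerate (illustrative only).
tolerated = [(0.1, 0.5), (0.2, 0.4), (0.2, 0.2), (0.3, 0.1), (0.05, 0.5)]
front = pareto_front(tolerated)
print(front)
```

In this toy input, (0.2, 0.2) and (0.05, 0.5) are dropped because another tolerated pair is at least as large on both axes. The quadratic scan is adequate for small sets; it is a sketch of the concept, not the method used in the project.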
Publications
Simon Sinong Z
(2024)
State-Wise Safe Reinforcement Learning with Pixel Observations
Wang Y
(2024)
POLAR-Express: Efficient and Precise Formal Reachability Analysis of Neural-Network Controlled Systems
in IEEE Transactions on Computer-Aided Design of Integrated Circuits and Systems
Ye J
(2024)
SaliencyCut: Augmenting plausible anomalies for anomaly detection
in Pattern Recognition
| Title | Time-delay reinforcement learning with auxiliary short delays |
| Description | This is an open-source tool that enables efficient time-delay reinforcement learning. |
| Type Of Material | Computer model/algorithm |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | We had a conference paper based on this algorithm. |
| URL | https://github.com/QingyuanWuNothing/AD-RL |
| Title | Vision-based safe reinforcement learning Tool |
| Description | This is the algorithm we propose for safe reinforcement learning with vision inputs. |
| Type Of Material | Computer model/algorithm |
| Year Produced | 2024 |
| Provided To Others? | Yes |
| Impact | We had a conference publication based on this algorithm. |
| URL | https://github.com/SimonZhan-code/Step-Wise_SafeRL_Pixel |
| Description | Main collaborators |
| Organisation | Nanyang Technological University |
| Country | Singapore |
| Sector | Academic/University |
| PI Contribution | We lead the project, formulating the research problems, proposing new algorithms, and conducting experiments. |
| Collaborator Contribution | Our partners meet with us every other week to discuss the work and provide suggestions. They also help with the experiments. |
| Impact | The collaboration has produced 4 publications and 2 open-source tools. Publication 1: POLAR-Express: Efficient and Precise Formal Reachability Analysis of Neural-Network Controlled Systems; Publication 2: Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays; Publication 3: State-Wise Safe Reinforcement Learning with Pixel Observations; Publication 4: Case Study: Runtime Safety Verification of Neural Network Controlled System; Tool 1: Vision-based safe reinforcement learning; Tool 2: Time-delay reinforcement learning with auxiliary short delays |
| Start Year | 2024 |
| Description | Main collaborators |
| Organisation | National Taiwan University |
| Country | Taiwan, Province of China |
| Sector | Academic/University |
| PI Contribution | We lead the project, formulating the research problems, proposing new algorithms, and conducting experiments. |
| Collaborator Contribution | Our partners meet with us every other week to discuss the work and provide suggestions. They also help with the experiments. |
| Impact | The collaboration has produced 4 publications and 2 open-source tools. Publication 1: POLAR-Express: Efficient and Precise Formal Reachability Analysis of Neural-Network Controlled Systems; Publication 2: Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays; Publication 3: State-Wise Safe Reinforcement Learning with Pixel Observations; Publication 4: Case Study: Runtime Safety Verification of Neural Network Controlled System; Tool 1: Vision-based safe reinforcement learning; Tool 2: Time-delay reinforcement learning with auxiliary short delays |
| Start Year | 2024 |
| Description | Main collaborators |
| Organisation | Northwestern University |
| Country | United States |
| Sector | Academic/University |
| PI Contribution | We lead the project, formulating the research problems, proposing new algorithms, and conducting experiments. |
| Collaborator Contribution | Our partners meet with us every other week to discuss the work and provide suggestions. They also help with the experiments. |
| Impact | The collaboration has produced 4 publications and 2 open-source tools. Publication 1: POLAR-Express: Efficient and Precise Formal Reachability Analysis of Neural-Network Controlled Systems; Publication 2: Boosting Reinforcement Learning with Strongly Delayed Feedback Through Auxiliary Short Delays; Publication 3: State-Wise Safe Reinforcement Learning with Pixel Observations; Publication 4: Case Study: Runtime Safety Verification of Neural Network Controlled System; Tool 1: Vision-based safe reinforcement learning; Tool 2: Time-delay reinforcement learning with auxiliary short delays |
| Start Year | 2024 |
| Description | New Collaboration - Florida |
| Organisation | University of Florida |
| Country | United States |
| Sector | Academic/University |
| PI Contribution | I provided support for using my tool POLAR in the project, technical advice, and revision of the paper draft. |
| Collaborator Contribution | They led the project. |
| Impact | The output is one publication. Paper 1: Bridging Dimensions: Confident Reachability for High-Dimensional Controllers |
| Start Year | 2024 |
| Description | Visit Verimag, France |
| Form Of Engagement Activity | A talk or presentation |
| Part Of Official Scheme? | No |
| Geographic Reach | Local |
| Primary Audience | Schools |
| Results and Impact | Prof. Thao Dang invited me to Verimag, France, to give a talk about my recent research and discuss potential collaborations. Around 20 researchers attended my talk. |
| Year(s) Of Engagement Activity | 2024 |
