Group Robust Preference Optimization in Reward-free RLHF (2024)
Attributed to: Robust and Efficient Model-based Reinforcement Learning, funded by EPSRC
Abstract
No abstract provided
Bibliographic Information
Type: Conference/Paper/Proceeding/Abstract