Q-Prop: Sample-efficient policy gradient with an off-policy critic (2017)
Attributed to:
Machine Learning for Hearing Aids: Intelligent Processing and Fitting (funded by EPSRC)
Abstract
No abstract provided
Bibliographic Information
Type: Other
Parent Publication: 5th International Conference on Learning Representations, ICLR 2017 - Conference Track Proceedings