Interpolated Policy Gradient: Merging On-Policy and Off-Policy Gradient Estimation for Deep Reinforcement Learning (2017)
Attributed to: Autonomous behaviour and learning in an uncertain world (funded by EPSRC)
Abstract
No abstract provided
Bibliographic Information
Digital Object Identifier: http://dx.doi.org/10.48550/arxiv.1706.00387
Publication URI: http://arxiv.org/abs/1706.00387v1
Type: Journal Article/Review
Parent Publication: arXiv e-prints