Q-Prop: Sample-Efficient Policy Gradient with An Off-Policy Critic (2016)
Attributed to: Unifying audio signal processing and machine learning: a fundamental framework for machine hearing (funded by EPSRC)
Abstract
No abstract provided
Bibliographic Information
Digital Object Identifier: http://dx.doi.org/10.48550/arxiv.1611.02247
Publication URI: https://arxiv.org/pdf/1611.02247.pdf
Type: Journal Article/Review
Parent Publication: arXiv e-prints