Difference rewards policy gradients (2022)
Attributed to: Learning to Efficiently Plan in Flexible Distributed Organizations (funded by EPSRC)
Bibliographic Information
Digital Object Identifier: http://dx.doi.org/10.1007/s00521-022-07960-5
Publication URI: http://dx.doi.org/10.1007/s00521-022-07960-5
Type: Journal Article/Review
Parent Publication: Neural Computing and Applications