Making asynchronous stochastic gradient descent work for transformers (2019)

First Author: Aji A.F.

Abstract

No abstract provided

Bibliographic Information

Type: Other

Parent Publication: EMNLP-IJCNLP 2019 - Proceedings of the 3rd Workshop on Neural Generation and Translation