When Can Self-Attention Be Replaced by Feed Forward Layers? (2020)

First Author: Zhang S
Attributed to: SpeechWave funded by EPSRC

Abstract

No abstract provided

Bibliographic Information

Digital Object Identifier: http://dx.doi.org/10.48550/arxiv.2005.13895

Publication URI: https://arxiv.org/abs/2005.13895

Type: Preprint