An Embarrassingly Simple Method to Mitigate Undesirable Properties of Pretrained Language Model Tokenizers (2022)
Attributed to: Exaggeration, cohesion, and fragmentation in on-line forums (funded by EPSRC)
Abstract
No abstract provided
Bibliographic Information
Digital Object Identifier: http://dx.doi.org/10.18653/v1/2022.acl-short.43
Publication URI: http://dx.doi.org/10.18653/v1/2022.acl-short.43
Type: Conference/Paper/Proceeding/Abstract