1 | WANLI: Worker and AI Collaboration for Natural Language Inference Dataset Creation
2 | Annotators with Attitudes: How Annotator Beliefs And Identities Bias Toxic Language Detection
4 | Specializing Multilingual Language Models: An Empirical Study
5 | Provable Limitations of Acquiring Meaning from Ungrounded Form: What Will Future Language Models Understand?
6 | Finetuning Pretrained Transformers into RNNs
The 2021 Conference on Empirical Methods in Natural Language Processing (EMNLP 2021); Chen, Weizhu; Ilharco, Gabriel; Kasai, Jungo; Mao, Yi; Pappas, Nikolaos; Peng, Hao; Smith, Noah; Yogatama, Dani; Zhang, Yizhe. Underline Science Inc., 2021.
Anthology paper link: https://aclanthology.org/2021.emnlp-main.830/

Abstract:
Transformers have outperformed recurrent neural networks (RNNs) in natural language generation. This comes with a significant computational overhead, as the attention mechanism scales with quadratic complexity in sequence length. Efficient transformer variants have received increasing interest in recent work. Among them, a linear-complexity recurrent variant has proven well suited for autoregressive generation. It approximates the softmax attention with randomized or heuristic feature maps, but can be difficult to train or may yield suboptimal accuracy. This work aims to convert a pretrained transformer into its efficient recurrent counterpart, improving the efficiency while retaining the accuracy. Specifically, we propose a swap-then-finetune procedure: in an off-the-shelf pretrained transformer, we replace the softmax attention with its linear-complexity recurrent alternative and then finetune. With a learned feature map, our ...
Keywords:
Computational Linguistics; Machine Learning; Machine Learning and Data Mining; Natural Language Processing; Neural Network
URL: https://dx.doi.org/10.48448/w4sb-sz82
URL: https://underline.io/lecture/37314-finetuning-pretrained-transformers-into-rnns
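
The swap-then-finetune procedure described in the abstract can be sketched as follows, assuming a PyTorch model. The module names (LearnedFeatureMap, CausalLinearAttention), the ReLU feature map, and the blocks/attn attribute names are illustrative assumptions for this sketch, not the paper's actual implementation.

import torch
import torch.nn as nn

class LearnedFeatureMap(nn.Module):
    """Learned map phi for queries/keys (assumed here: linear layer + ReLU,
    kept strictly positive so the attention normalizer is well defined)."""
    def __init__(self, head_dim: int, feature_dim: int):
        super().__init__()
        self.proj = nn.Linear(head_dim, feature_dim)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return torch.relu(self.proj(x)) + 1e-6

class CausalLinearAttention(nn.Module):
    """Linear-complexity stand-in for causal softmax attention:
        out_t = phi(q_t) @ S_t / (phi(q_t) . z_t),
    where S_t = sum_{i<=t} phi(k_i) v_i^T and z_t = sum_{i<=t} phi(k_i).
    Prefix sums make the cost O(n) in sequence length instead of O(n^2)."""
    def __init__(self, head_dim: int, feature_dim: int = 32):
        super().__init__()
        self.phi = LearnedFeatureMap(head_dim, feature_dim)

    def forward(self, q, k, v):
        # q, k, v: (batch, heads, seq, head_dim)
        q, k = self.phi(q), self.phi(k)                        # (b, h, s, f)
        kv = torch.einsum("bhsf,bhsd->bhsfd", k, v).cumsum(2)  # S_t for every t
        num = torch.einsum("bhsf,bhsfd->bhsd", q, kv)          # phi(q_t) @ S_t
        den = torch.einsum("bhsf,bhsf->bhs", q, k.cumsum(2))   # phi(q_t) . z_t
        return num / den.unsqueeze(-1)

# Swap step (attribute names hypothetical), then finetune as usual:
# for block in pretrained_model.blocks:
#     block.attn = CausalLinearAttention(head_dim=block.attn.head_dim)

# Shape check on random inputs:
x = torch.randn(2, 4, 10, 16)  # (batch, heads, seq, head_dim)
out = CausalLinearAttention(head_dim=16)(x, x, x)
assert out.shape == x.shape

At generation time the prefix sums S_t and z_t collapse to a constant-size recurrent state updated once per token, which is what makes the converted model behave like an RNN.
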
8 | Sentence Bottleneck Autoencoders from Transformer Language Models
9 | All That's 'Human' Is Not Gold: Evaluating Human Evaluation of Generated Text
10 | Measuring Association Between Labels and Free-Text Rationales
11 | Promoting Graph Awareness in Linearized Graph-to-Text Generation
12 | Shortformer: Better Language Modeling using Shorter Inputs
13 | DExperts: Decoding-Time Controlled Text Generation with Experts and Anti-Experts
15 | Challenges in Automated Debiasing for Toxic Language Detection
16 | NeuroLogic A*esque Decoding: Constrained Text Generation with Lookahead Heuristics
17 | Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent
18 | Competency Problems: On Finding and Removing Artifacts in Language Data