2 | Sentence Bottleneck Autoencoders from Transformer Language Models

Anthology paper link: https://aclanthology.org/2021.emnlp-main.137/

Abstract: Representation learning for text via pretraining a language model on a large corpus has become a standard starting point for building NLP systems. This approach stands in contrast to autoencoders, also trained on raw text, but with the objective of learning to encode each input as a vector that allows full reconstruction. Autoencoders are attractive because of their latent space structure and generative properties. We therefore explore the construction of a sentence-level autoencoder from a pretrained, frozen transformer language model. We adapt the masked language modeling objective as a generative, denoising one, while only training a sentence bottleneck and a single-layer modified transformer decoder. We demonstrate that the sentence representations discovered by our model achieve better quality than previous methods that extract representations from pretrained transformers on text similarity tasks, style transfer (an example of ...

Keyword: Computational Linguistics; Language Models; Machine Learning; Machine Learning and Data Mining; Natural Language Processing

URL: https://underline.io/lecture/37876-sentence-bottleneck-autoencoders-from-transformer-language-models
DOI: https://dx.doi.org/10.48448/k600-qa97
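The abstract above describes compressing a frozen transformer's token states into a single sentence vector (the "bottleneck"), with only the pooling layer and a small decoder trained. As a rough illustration of the pooling step only, here is a minimal NumPy sketch using a single learned attention query; this is an illustrative assumption for exposition, not the paper's actual architecture or code.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def sentence_bottleneck(token_states, W_q, W_k, W_v):
    """Pool a (seq_len, d) matrix of frozen encoder states into one
    d-dimensional sentence vector via a single learned attention query.
    W_q, W_k, W_v are hypothetical trainable parameters."""
    q = W_q                                    # (d,) learned query vector
    k = token_states @ W_k                     # (seq_len, d) keys
    v = token_states @ W_v                     # (seq_len, d) values
    scores = softmax(k @ q / np.sqrt(len(q)))  # (seq_len,) attention weights
    return scores @ v                          # (d,) sentence embedding

rng = np.random.default_rng(0)
d, seq_len = 8, 5
H = rng.normal(size=(seq_len, d))   # stand-in for frozen LM token states
Wq = rng.normal(size=d)
Wk = rng.normal(size=(d, d))
Wv = rng.normal(size=(d, d))
z = sentence_bottleneck(H, Wq, Wk, Wv)
print(z.shape)  # (8,)
```

In the paper's setting, the encoder producing `H` stays frozen; only the pooling parameters and a small decoder are optimized under a denoising reconstruction objective.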
BASE

3 | Grounded Compositional Outputs for Adaptive Language Modeling

4 | Regularization Advantages of Multilingual Neural Language Models for Low Resource Domains

5 | Stacked Neural Networks with Parameter Sharing for Multilingual Language Modeling
In: http://infoscience.epfl.ch/record/272000 (2019)

6 | GILE: A Generalized Input-Label Embedding for Text Classification
In: Transactions of the Association for Computational Linguistics, Vol 7, pp. 139-155 (2019)

7 | Integrating Weakly Supervised Word Sense Disambiguation into Neural Machine Translation

10 | Multilingual Hierarchical Attention Networks for Document Classification

15 | Self-Attentive Residual Decoder for Neural Machine Translation

16 | Sense-Aware Statistical Machine Translation using Adaptive Context-Dependent Clustering

18 | Multilingual Hierarchical Attention Networks for Document Classification
In: http://infoscience.epfl.ch/record/231134 (2017)

19 | Cross-lingual Transfer for News Article Labeling: Benchmarking Statistical and Neural Models
In: http://infoscience.epfl.ch/record/231130 (2017)

20 | Evaluating Attention Networks for Anaphora Resolution
In: http://infoscience.epfl.ch/record/231846 (2017)