
Search in the Catalogues and Directories

Hits 1 – 7 of 7

1
On the Representation Collapse of Sparse Mixture of Experts
Chi, Zewen; Dong, Li; Huang, Shaohan. arXiv, 2022
BASE
2
Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training
Abstract: Compared to monolingual models, cross-lingual models usually require a more expressive vocabulary to represent all languages adequately. We find that many languages are under-represented in recent cross-lingual language models due to limited vocabulary capacity. To this end, we propose an algorithm, VoCap, to determine the desired vocabulary capacity of each language. However, increasing the vocabulary size significantly slows down pre-training. To address this, we propose k-NN-based target sampling to accelerate the expensive softmax. Our experiments show that the multilingual vocabulary learned with VoCap benefits cross-lingual language model pre-training. Moreover, k-NN-based target sampling mitigates the side effects of the increased vocabulary size while achieving comparable performance and faster pre-training speed. The code and the pretrained multilingual vocabularies are available at https://github.com/bozheng-hit/VoCapXLM. (EMNLP 2021)
Keywords: Computation and Language (cs.CL); FOS: Computer and information sciences
URL: https://arxiv.org/abs/2109.07306
https://dx.doi.org/10.48550/arxiv.2109.07306
BASE
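The abstract above credits two ideas: VoCap, which allocates per-language vocabulary capacity, and k-NN-based target sampling, which replaces the full softmax over the enlarged vocabulary with a softmax over a small candidate set. As a rough illustration of the second idea only, here is a minimal NumPy sketch of a sampled softmax whose negative candidates are the k nearest neighbours of the target token's output embedding. Everything here (the function name, the on-the-fly neighbour search, dot-product similarity) is an assumption for illustration, not the authors' implementation; the actual code is in the linked VoCapXLM repository.

```python
import numpy as np

def knn_sampled_softmax_loss(hidden, target_ids, emb, k=128):
    """Cross-entropy over a reduced candidate set instead of the full vocabulary.

    For each gold token, the candidates are the token itself plus the k tokens
    whose output embeddings are most similar to it (hard negatives), so the
    softmax normalizes over at most k+1 rows of `emb` instead of all |V|.

    hidden:     (batch, dim) final hidden states
    target_ids: (batch,) gold token ids
    emb:        (|V|, dim) output embedding matrix
    """
    losses = []
    for h, t in zip(hidden, target_ids):
        sims = emb @ emb[t]                    # similarity of every token to the target
        knn = np.argpartition(-sims, k)[:k]    # indices of the k most similar tokens
        cand = np.unique(np.append(knn, t))    # candidate set: target + neighbours
        logits = emb[cand] @ h
        logits -= logits.max()                 # numerical stability
        log_z = np.log(np.exp(logits).sum())
        tgt_pos = int(np.where(cand == t)[0][0])
        losses.append(log_z - logits[tgt_pos])  # -log p(target | candidates)
    return float(np.mean(losses))

# Toy usage with random embeddings (shapes only, no trained model).
rng = np.random.default_rng(0)
V, d = 10_000, 64
emb = rng.standard_normal((V, d)).astype(np.float32)
hidden = rng.standard_normal((4, d)).astype(np.float32)
targets = rng.integers(0, V, size=4)
print(knn_sampled_softmax_loss(hidden, targets, emb))
```

In practice the neighbour lists would be precomputed once over the output embedding matrix rather than recomputed per token, since the point of the technique is to avoid touching all |V| rows at every training step.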
3
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task
BASE
4
DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders
Ma, Shuming; Dong, Li; Huang, Shaohan. arXiv, 2021
BASE
5
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
Chi, Zewen; Huang, Shaohan; Dong, Li. arXiv, 2021
BASE
6
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training
Chi, Zewen; Dong, Li; Wei, Furu. arXiv, 2020
BASE
7
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
BASE

Hits by source: Catalogues: 0 · Bibliographies: 0 · Linked Open Data catalogues: 0 · Online resources: 0 · Open access documents: 7