
Search in the Catalogues and Directories

Hits 1 – 7 of 7

1
On the Representation Collapse of Sparse Mixture of Experts
Chi, Zewen; Dong, Li; Huang, Shaohan. arXiv, 2022
BASE
2
Allocating Large Vocabulary Capacity for Cross-lingual Language Model Pre-training
Abstract: Compared to monolingual models, cross-lingual models usually require a more expressive vocabulary to represent all languages adequately. We find that many languages are under-represented in recent cross-lingual language models due to limited vocabulary capacity. To this end, we propose an algorithm, VoCap, to determine the desired vocabulary capacity of each language. However, increasing the vocabulary size significantly slows down pre-training. To address this, we propose k-NN-based target sampling to accelerate the expensive softmax. Our experiments show that the multilingual vocabulary learned with VoCap benefits cross-lingual language model pre-training. Moreover, k-NN-based target sampling mitigates the side effects of the increased vocabulary size while achieving comparable performance and faster pre-training speed. The code and the pretrained multilingual vocabularies are available at https://github.com/bozheng-hit/VoCapXLM. (EMNLP 2021)
Keywords: Computation and Language (cs.CL); FOS: Computer and information sciences
URL: https://arxiv.org/abs/2109.07306
https://dx.doi.org/10.48550/arxiv.2109.07306
BASE
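The abstract above credits two ideas: VoCap, which allocates per-language vocabulary capacity, and k-NN-based target sampling, which replaces the full softmax over the enlarged vocabulary with a softmax over a small candidate set. As a rough illustration of the second idea only, here is a minimal NumPy sketch of a sampled softmax whose negative candidates are the k nearest neighbours of the target token's output embedding. Everything here (the function name, the on-the-fly neighbour search, dot-product similarity) is an assumption for illustration, not the authors' implementation; the actual code is in the linked VoCapXLM repository.

```python
import numpy as np

def knn_sampled_softmax_loss(hidden, target_ids, emb, k=128):
    """Cross-entropy over a reduced candidate set instead of the full vocabulary.

    For each gold token, the candidates are the token itself plus the k tokens
    whose output embeddings are most similar to it (hard negatives), so the
    softmax normalizes over at most k+1 rows of `emb` instead of all |V|.

    hidden:     (batch, dim) final hidden states
    target_ids: (batch,) gold token ids
    emb:        (|V|, dim) output embedding matrix
    """
    losses = []
    for h, t in zip(hidden, target_ids):
        sims = emb @ emb[t]                    # similarity of every token to the target
        knn = np.argpartition(-sims, k)[:k]    # indices of the k most similar tokens
        cand = np.unique(np.append(knn, t))    # candidate set: target + neighbours
        logits = emb[cand] @ h
        logits -= logits.max()                 # numerical stability
        log_z = np.log(np.exp(logits).sum())
        tgt_pos = int(np.where(cand == t)[0][0])
        losses.append(log_z - logits[tgt_pos])  # -log p(target | candidates)
    return float(np.mean(losses))

# Toy usage with random embeddings (shapes only, no trained model).
rng = np.random.default_rng(0)
V, d = 10_000, 64
emb = rng.standard_normal((V, d)).astype(np.float32)
hidden = rng.standard_normal((4, d)).astype(np.float32)
targets = rng.integers(0, V, size=4)
print(knn_sampled_softmax_loss(hidden, targets, emb))
```

In practice the neighbour lists would be precomputed once over the output embedding matrix rather than recomputed per token, since the point of the technique is to avoid touching all |V| rows at every training step.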
3
Multilingual Machine Translation Systems from Microsoft for WMT21 Shared Task
BASE
4
DeltaLM: Encoder-Decoder Pre-training for Language Generation and Translation by Augmenting Pretrained Multilingual Encoders
Ma, Shuming; Dong, Li; Huang, Shaohan. arXiv, 2021
BASE
5
XLM-E: Cross-lingual Language Model Pre-training via ELECTRA
Chi, Zewen; Huang, Shaohan; Dong, Li. arXiv, 2021
BASE
6
InfoXLM: An Information-Theoretic Framework for Cross-Lingual Language Model Pre-Training
Chi, Zewen; Dong, Li; Wei, Furu. arXiv, 2020
BASE
7
XLM-T: Scaling up Multilingual Machine Translation with Pretrained Cross-lingual Transformer Encoders
BASE

Hits by source: Catalogues: 0 · Bibliographies: 0 · Linked Open Data catalogues: 0 · Online resources: 0 · Open access documents: 7