1 |
Towards the Next 1000 Languages in Multilingual Machine Translation: Exploring the Synergy Between Supervised and Self-Supervised Learning ...
3 |
Examining Scaling and Transfer of Language Model Architectures for Machine Translation ...
Abstract:
Natural language understanding and generation models follow one of the two dominant architectural paradigms: language models (LMs) that process concatenated sequences in a single stack of layers, and encoder-decoder models (EncDec) that utilize separate layer stacks for input and output processing. In machine translation, EncDec has long been the favoured approach, but with few studies investigating the performance of LMs. In this work, we thoroughly examine the role of several architectural design choices on the performance of LMs on bilingual, (massively) multilingual and zero-shot translation tasks, under systematic variations of data conditions and model sizes. Our results show that: (i) Different LMs have different scaling properties, where architectural differences often have a significant impact on model performance at small scales, but the performance gap narrows as the number of parameters increases, (ii) Several design choices, including causal masking and language-modeling objectives for the ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences; Machine Learning cs.LG
URL: https://arxiv.org/abs/2202.00528 https://dx.doi.org/10.48550/arxiv.2202.00528
5 |
Few-shot Controllable Style Transfer for Low-Resource Multilingual Settings ...
6 |
Towards Continual Learning for Multilingual Machine Translation via Vocabulary Substitution ...
7 |
Towards Continual Learning for Multilingual Machine Translation via Vocabulary Substitution ...
8 |
Harnessing Multilinguality in Unsupervised Machine Translation for Rare Languages ...
9 |
Harnessing Multilinguality in Unsupervised Machine Translation for Rare Languages ...
10 |
A Multilingual View of Unsupervised Machine Translation ...
11 |
Holiday or vacation? The processing of variation in vocabulary across dialects
12 |
World knowledge integration during second language comprehension
13 |
World knowledge integration during second language comprehension
14 |
World knowledge and novel information integration during L2 speech comprehension
15 |
Does the speaker matter? Online processing of semantic and pragmatic information in L2 speech comprehension