1 |
Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity
In: Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), 2020, 46 (4), pp. 847-897 ; ISSN: 0891-2017 ; EISSN: 1530-9312 ; https://hal.archives-ouvertes.fr/hal-02975786 ; https://direct.mit.edu/coli/article/46/4/847/97326/Multi-SimLex-A-Large-Scale-Evaluation-of
BASE
2 |
Span-ConveRT: Few-shot Span Extraction for Dialog with Pretrained Conversational Representations ...
4 |
Multidirectional Associative Optimization of Function-Specific Word Representations ...
5 |
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning ...
6 |
Emergent Communication Pretraining for Few-Shot Machine Translation ...
Abstract:
While state-of-the-art models that rely upon massively multilingual pretrained encoders achieve sample efficiency in downstream applications, they still require abundant amounts of unlabelled text. Nevertheless, most of the world's languages lack such resources. Hence, we investigate a more radical form of unsupervised knowledge transfer in the absence of linguistic data. In particular, for the first time we pretrain neural networks via emergent communication from referential games. Our key assumption is that grounding communication on images---as a crude approximation of real-world environments---inductively biases the model towards learning natural languages. On the one hand, we show that this substantially benefits machine translation in few-shot settings. On the other hand, this also provides an extrinsic evaluation protocol to probe the properties of emergent languages ex vitro. Intuitively, the closer they are to natural languages, the higher the gains from pretraining on them should be. For instance, ...
Keyword:
Artificial Intelligence cs.AI; Computation and Language cs.CL; FOS Computer and information sciences; Machine Learning cs.LG
URL: https://dx.doi.org/10.48550/arxiv.2011.00890 ; https://arxiv.org/abs/2011.00890
7 |
Orthogonal Language and Task Adapters in Zero-Shot Cross-Lingual Transfer ...
8 |
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer ...
9 |
How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models ...
10 |
UNKs Everywhere: Adapting Multilingual Language Models to New Scripts ...
12 |
MAD-X: An Adapter-Based Framework for Multi-Task Cross-Lingual Transfer ...
13 |
Manual Clustering and Spatial Arrangement of Verbs for Multilingual Evaluation and Typology Analysis ...
14 |
XHate-999: Analyzing and Detecting Abusive Language Across Domains and Languages ...
15 |
Emergent Communication Pretraining for Few-Shot Machine Translation ...
16 |
From Zero to Hero: On the Limitations of Zero-Shot Cross-Lingual Transfer with Multilingual Transformers ...
17 |
XCOPA: A Multilingual Dataset for Causal Commonsense Reasoning ...
18 |
Emergent Communication Pretraining for Few-Shot Machine Translation ...
19 |
Manual Clustering and Spatial Arrangement of Verbs for Multilingual Evaluation and Typology Analysis ...
20 |
A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters ...