2. A Call for More Rigor in Unsupervised Cross-lingual Learning

3. On the Cross-lingual Transferability of Monolingual Representations

Abstract: State-of-the-art unsupervised multilingual models (e.g., multilingual BERT) have been shown to generalize in a zero-shot cross-lingual setting. This generalization ability has been attributed to the use of a shared subword vocabulary and joint training across multiple languages, which give rise to deep multilingual abstractions. We evaluate this hypothesis by designing an alternative approach that transfers a monolingual model to new languages at the lexical level. More concretely, we first train a transformer-based masked language model on one language, and transfer it to a new language by learning a new embedding matrix with the same masked language modeling objective, freezing the parameters of all other layers. This approach does not rely on a shared vocabulary or joint training. However, we show that it is competitive with multilingual BERT on standard cross-lingual classification benchmarks and on a new Cross-lingual Question Answering Dataset (XQuAD). Our results contradict common beliefs of the basis of the ...
In: ACL (2020)

Keywords: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); FOS: Computer and information sciences; Machine Learning (cs.LG)

URL: https://dx.doi.org/10.48550/arxiv.1910.11856 ; https://arxiv.org/abs/1910.11856
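
The lexical-transfer recipe summarized in this abstract (learn a new embedding matrix with the masked language modeling objective while every other layer stays frozen) is easy to sketch in code. The snippet below is a minimal illustration assuming the Hugging Face transformers API; the checkpoint name and the new vocabulary size are hypothetical placeholders, not the paper's actual setup.

# A minimal sketch of the transfer step described in the abstract above,
# assuming the Hugging Face transformers API. The checkpoint name and the
# vocabulary size of the new language are hypothetical placeholders.
from transformers import BertForMaskedLM

# Start from a monolingual masked language model trained on language L1.
model = BertForMaskedLM.from_pretrained("bert-base-cased")

# Swap in an embedding matrix sized for the new language's vocabulary.
# (The paper learns this matrix from scratch; resize_token_embeddings only
# reshapes it, so a full re-initialization may also be appropriate.)
model.resize_token_embeddings(30000)

# Freeze every parameter ...
for param in model.parameters():
    param.requires_grad = False

# ... except the token embeddings, which are then trained on language L2
# with the same masked language modeling objective.
for param in model.get_input_embeddings().parameters():
    param.requires_grad = True

Because BERT ties its input and output embeddings by default, unfreezing the input matrix also makes the MLM output projection trainable, which is consistent with transferring only the lexical level; the L2 tokenizer and the MLM training loop itself are omitted here.
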
4. Learning Word Representations with Hierarchical Sparse Coding

8. Predicting a Scientific Community’s Response to an Article

10. Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
In: DTIC (2010)