1. Everything Is All It Takes: A Multipronged Strategy for Zero-Shot Cross-Lingual Information Extraction
2. Fine-tuning Encoders for Improved Monolingual and Zero-shot Polylingual Neural Topic Modeling
4. Cross-Lingual Transfer in Zero-Shot Cross-Language Entity Linking
6. Do Explicit Alignments Robustly Improve Multilingual Encoders?
7. Sources of Transfer in Multilingual Named Entity Recognition
9. Are All Languages Created Equal in Multilingual BERT?
Abstract: Multilingual BERT (mBERT), trained on 104 languages, has shown surprisingly good cross-lingual performance on several NLP tasks, even without explicit cross-lingual signals. However, these evaluations have focused on cross-lingual transfer with high-resource languages, covering only a third of the languages mBERT supports. We explore how mBERT performs on a much wider set of languages, focusing on the quality of representation for low-resource languages, measured by within-language performance. We consider three tasks: Named Entity Recognition (99 languages), Part-of-Speech Tagging, and Dependency Parsing (54 languages each). mBERT performs better than or comparably to baselines on high-resource languages but does much worse on low-resource languages. Furthermore, monolingual BERT models for these languages do even worse. Pairing a low-resource language with similar languages narrows the performance gap between monolingual BERT and mBERT. We find that better models for low-resource languages require more efficient pretraining ...
In: RepL4NLP Workshop 2020 (Best Long Paper)
Keywords: Computation and Language (cs.CL); FOS: Computer and information sciences
URL: https://arxiv.org/abs/2005.09093
DOI: https://dx.doi.org/10.48550/arxiv.2005.09093
10. Learning unsupervised contextual representations for medical synonym discovery
12. Improving Named Entity Recognition for Chinese Social Media with Word Segmentation Representation Learning
14. Improved Relation Extraction with Feature-Rich Compositional Embedding Models
16. Annotating named entities in Twitter data with crowdsourcing
17. Reading the Markets: Forecasting Public Opinion of Political Candidates by News Analysis
19. Further Results and Analysis of Icelandic Part of Speech Tagging
In: Technical Reports (CIS) (2008)