1 |
Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval ...
2 |
Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders ...
4 |
Cross-lingual semantic specialization via lexical relation induction ...
5 |
Adversarial propagation and zero-shot cross-lingual transfer of word vector specialization ...
6 |
Do we really need fully unsupervised cross-lingual embeddings? ...
7 |
On the relation between linguistic typology and (limitations of) multilingual language modeling ...
8 |
Cross-lingual semantic specialization via lexical relation induction
Abstract:
Semantic specialization integrates structured linguistic knowledge from external resources (such as lexical relations in WordNet) into pretrained distributional vectors in the form of constraints. However, this technique cannot be leveraged in many languages, because their structured external resources are typically incomplete or non-existent. To bridge this gap, we propose a novel method that transfers specialization from a resource-rich source language (English) to virtually any target language. Our specialization transfer comprises two crucial steps: 1) Inducing noisy constraints in the target language through automatic word translation; and 2) Filtering the noisy constraints via a state-of-the-art relation prediction model trained on the source language constraints. This allows us to specialize any set of distributional vectors in the target language with the refined constraints. We prove the effectiveness of our method through intrinsic word similarity evaluation in 8 languages, and with 3 downstream tasks in 5 languages: lexical simplification, dialog state tracking, and semantic textual similarity. The gains over the previous state-of-the-art specialization methods are substantial and consistent across languages. Our results also suggest that the transfer method is effective even for lexically distant source-target language pairs. Finally, as a by-product, our method produces lists of WordNet-style lexical relations in resource-poor languages.
URL: https://doi.org/10.17863/CAM.43734 https://www.repository.cam.ac.uk/handle/1810/296686
9 |
On the relation between linguistic typology and (limitations of) multilingual language modeling
10 |
Adversarial propagation and zero-shot cross-lingual transfer of word vector specialization
11 |
Do we really need fully unsupervised cross-lingual embeddings?
Vulić, I; Glavaš, G; Reichart, R. - EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference, 2020
12 |
Towards zero-shot language modeling
Ponti, Edoardo; Vulić, I; Cotterell, R. - EMNLP-IJCNLP 2019 - 2019 Conference on Empirical Methods in Natural Language Processing and 9th International Joint Conference on Natural Language Processing, Proceedings of the Conference, 2020
14 |
Zero-shot language transfer for cross-lingual sentence retrieval using bidirectional attention model ...
15 |
Learning unsupervised multilingual word embeddings with incremental multilingual hubs ...
16 |
Specializing distributional vectors of all words for lexical entailment ...
17 |
Investigating cross-lingual alignment methods for contextualized embeddings with Token-level evaluation ...
18 |
Specializing distributional vectors of all words for lexical entailment
19 |
Investigating cross-lingual alignment methods for contextualized embeddings with Token-level evaluation
20 |
Learning unsupervised multilingual word embeddings with incremental multilingual hubs
Heyman, G; Verreet, B; Vulić, I. - NAACL HLT 2019 - 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies - Proceedings of the Conference, 2019