81 |
Bilingual Lexicon Induction with Semi-supervision in Non-Isometric Embedding Spaces ...
|
|
|
|
Abstract:
Recent work on bilingual lexicon induction (BLI) has frequently depended either on aligned bilingual lexicons or on distribution matching, often with an assumption about the isometry of the two spaces. We propose a technique to quantitatively estimate how well this isometry assumption holds between two embedding spaces and empirically show that the assumption weakens as the languages in question become increasingly etymologically distant. We then propose Bilingual Lexicon Induction with Semi-Supervision (BLISS), a semi-supervised approach that relaxes the isometric assumption while leveraging both limited aligned bilingual lexicons and a larger set of unaligned word embeddings, as well as a novel hubness filtering technique. Our proposed method obtains state-of-the-art results on 15 of 18 language pairs on the MUSE dataset, and does particularly well when the embedding spaces do not appear to be isometric. We also show that adding supervision stabilizes the learning procedure, and is effective even ... : ACL 2019 ...
|
|
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences; Machine Learning cs.LG
|
|
URL: https://arxiv.org/abs/1908.06625 https://dx.doi.org/10.48550/arxiv.1908.06625
|
|
BASE
|
|
82 |
Generalization in Generation: A closer look at Exposure Bias
|
|
|
|
In: Proceedings of the 3rd Workshop on Neural Generation and Translation (2019)
|
|
BASE
|
|
83 |
Attention-Passing Models for Robust and Data-Efficient End-to-End Speech Translation
|
|
|
|
In: Transactions of the Association for Computational Linguistics, Vol 7, Pp 313-325 (2019)
|
|
BASE
|
|
84 |
Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the “Speaking Rosetta” JSALT 2017 workshop
|
|
|
|
In: ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2018, Calgary, Alberta, Canada (2018) ; https://hal.archives-ouvertes.fr/hal-01709578
|
|
BASE
|
|
85 |
Evaluating phonemic transcription of low-resource tonal languages for language documentation
|
|
|
|
In: LREC 2018 (Language Resources and Evaluation Conference), May 2018, Miyazaki, Japan, pp. 3356-3365 (2018) ; https://halshs.archives-ouvertes.fr/halshs-01709648
|
|
BASE
|
|
86 |
Integrating automatic transcription into the language documentation workflow: Experiments with Na data and the Persephone toolkit
|
|
|
|
In: Language Documentation & Conservation, University of Hawaiʻi Press 2018, 12, pp. 393-429 (2018) ; ISSN: 1934-5275 ; EISSN: 1934-5275 ; https://halshs.archives-ouvertes.fr/halshs-01841979 ; hdl.handle.net/10125/24793
|
|
BASE
|
|
87 |
Adapting Word Embeddings to New Languages with Morphological and Phonological Subword Representations ...
|
|
|
|
BASE
|
|
88 |
Parameter Sharing Methods for Multilingual Self-Attentional Translation Models ...
|
|
|
|
BASE
|
|
89 |
Towards a General-Purpose Linguistic Annotation Backend ...
|
|
|
|
BASE
|
|
90 |
Multi-Source Neural Machine Translation with Missing Data ...
|
|
|
|
BASE
|
|
91 |
Rapid Adaptation of Neural Machine Translation to New Languages ...
|
|
|
|
BASE
|
|
92 |
Findings of the Second Workshop on Neural Machine Translation and Generation ...
|
|
|
|
BASE
|
|
94 |
Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop ...
|
|
|
|
BASE
|
|
95 |
Attentive Interaction Model: Modeling Changes in View in Argumentation ...
|
|
|
|
BASE
|
|
96 |
Neural Cross-Lingual Named Entity Recognition with Minimal Resources ...
|
|
|
|
BASE
|
|
98 |
Zero-shot Neural Transfer for Cross-lingual Entity Linking ...
|
|
|
|
BASE
|
|
99 |
Neural Factor Graph Models for Cross-lingual Morphological Tagging ...
|
|
|
|
BASE
|
|