1. Machine Learning approaches for Topic and Sentiment Analysis in multilingual opinions and low-resource languages: From English to Guarani
BASE
2. Investigating alignment interpretability for low-resource NMT
In: Machine Translation, Springer Verlag, 2021 ; ISSN: 0922-6567 ; EISSN: 1573-0573 ; https://hal.archives-ouvertes.fr/hal-03139744 ; ⟨10.1007/s10590-020-09254-w⟩ (2021)
3. A speech-enabled fixed-phrase translator for healthcare accessibility
In: Proceedings of the 1st Workshop on NLP for Positive Impact (2021)
4. Modeling phones, keywords, topics and intents in spoken languages
5. Multilingual offensive language identification for low-resource languages
In: 21 ; 1 (2021)
7. Terminology-aware sentence mining for NMT domain adaptation: ADAPT’s submission to the Adap-MT 2020 English-to-Hindi AI translation shared task
In: Haque, Rejwanul orcid:0000-0003-1680-0099, Moslem, Yasmin orcid:0000-0003-4595-6877 and Way, Andy orcid:0000-0001-5736-5930 (2020). Workshop on Low Resource Domain Adaptation for Indic Machine Translation (Adap-MT 2020), 18-21 Dec 2020, Patna, India (Online). (In Press) (2020)
8. Building a Universal Dependencies Treebank for Occitan
In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), May 2020, Marseille, France. pp. 2932-2939 ; https://hal.archives-ouvertes.fr/hal-02892715 (2020)
9. Findings of the LoResMT 2020 shared task on zero-shot for low-resource languages
10. Advanced Convolutional Neural Network-Based Hybrid Acoustic Models for Low-Resource Speech Recognition
In: Computers ; Volume 9 ; Issue 2 (2020)
11. Towards Language Service Creation and Customization for Low-Resource Languages
In: Information ; Volume 11 ; Issue 2 (2020)
12. Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-resource Settings
In: Interspeech 2019, Sep 2019, Graz, Austria ; https://hal.archives-ouvertes.fr/hal-02193867 (2019)
13. Cross-lingual parsing with polyglot training and multi-treebank learning: a Faroese case study
In: Barry, James orcid:0000-0003-3051-585X, Wagner, Joachim orcid:0000-0002-8290-3849 and Foster, Jennifer orcid:0000-0002-7789-4853 (2019). The 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), 3 - 5 Nov 2019, Hong Kong, China. ISBN 978-1-950737-78-9 (2019)
14. Unsupervised word discovery for computational language documentation ; Découverte non-supervisée de mots pour outiller la linguistique de terrain
In: https://tel.archives-ouvertes.fr/tel-02286425 ; Artificial Intelligence [cs.AI]. Université Paris Saclay (COmUE), 2019. English. ⟨NNT : 2019SACLS062⟩ (2019)
15. Crowdsourcing the Paldaruo Speech Corpus of Welsh for Speech Technology
In: Information ; Volume 10 ; Issue 8 (2019)
16. Constructing Uyghur Commonsense Knowledge Base by Knowledge Projection
In: Applied Sciences ; Volume 9 ; Issue 16 (2019)
17. The Usefulness of Imperfect Speech Data for ASR Development in Low-Resource Languages
In: Information ; Volume 10 ; Issue 9 (2019)
18. Improving Semantic Similarity with Cross-Lingual Resources: A Study in Bangla—A Low Resourced Language
In: Informatics ; Volume 6 ; Issue 2 (2019)
19. Adapting NMT to caption translation in Wikimedia Commons for low-resource languages ; Adaptando NMT a la traducción de pies de imagen en Wikimedia Commons para idiomas con pocos recursos
20. Filtering of Noisy Parallel Corpora Based on Hypothesis Generation
Abstract:
The WMT 2019 noisy parallel corpus filtering task challenges participants to build filtering methods that are useful for training machine translation systems. In this work, we introduce a system for filtering noisy parallel corpora that generates hypotheses with a translation model. We train translation models for both language pairs, Nepali-English and Sinhala-English, using the provided parallel corpora. To obtain the best possible translation model, we first join all the provided parallel corpora (Nepali, Sinhala and Hindi to English) and then apply bilingual cross-entropy selection for both language pairs. Once the translation models are trained, we translate the noisy corpora and generate a hypothesis for each sentence pair. We compute the smoothed BLEU score between the target sentence and the generated hypothesis. In addition, we apply several rules to discard very noisy or inadequate sentences, which can lower the translation score. These heuristics are based on sentence length, source and target similarity, and source language detection. We compare our results with the baseline published on the shared task website, which uses the Zipporah model, and achieve significant improvements over it in one of the shared task conditions. The filtering system is domain independent, and all experiments are conducted using neural machine translation.
Work partially supported by MINECO under grant DI-15-08169 and by Sciling under its R+D programme. The authors would like to thank NVIDIA for the donation of the Titan Xp GPU that made this research possible.
Parcheta, Z.; Sanchis Trilles, G.; Casacuberta Nolla, F. (2019). Filtering of Noisy Parallel Corpora Based on Hypothesis Generation. The Association for Computational Linguistics. 284-290. https://doi.org/10.18653/v1/W19-5439
Keyword:
Corpus filtering; Hypothesis Generation; LENGUAJES Y SISTEMAS INFORMATICOS; Low-resource languages; Noisy corpora
URL: http://hdl.handle.net/10251/180620 https://doi.org/10.18653/v1/W19-5439
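Illustrative sketch (not the authors' code): the abstract of record 20 describes scoring each noisy sentence pair by the smoothed BLEU between its target side and a machine-translated hypothesis of its source side, after rule-based pre-filtering. The Python below shows that idea under several assumptions. The smoothing variant (add-one for n-gram orders 2 and above), the thresholds min_bleu and max_len_ratio, and the function names smoothed_bleu and filter_pairs are all hypothetical, and translate stands in for whatever trained translation model is available. Source language detection, which the abstract also mentions, is omitted.

```python
import math
from collections import Counter
from typing import Callable, Iterable, List, Tuple


def ngrams(tokens: List[str], n: int) -> Counter:
    """Count the n-grams of order n in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def smoothed_bleu(reference: List[str], hypothesis: List[str], max_n: int = 4) -> float:
    """Sentence-level BLEU with add-one smoothing for orders >= 2 (assumed variant)."""
    if not reference or not hypothesis:
        return 0.0
    log_precision = 0.0
    for n in range(1, max_n + 1):
        hyp_counts = ngrams(hypothesis, n)
        ref_counts = ngrams(reference, n)
        overlap = sum(min(count, ref_counts[gram]) for gram, count in hyp_counts.items())
        total = sum(hyp_counts.values())
        if n == 1:
            if overlap == 0:
                return 0.0  # no shared words at all
            precision = overlap / total
        else:
            precision = (overlap + 1) / (total + 1)
        log_precision += math.log(precision) / max_n
    # Brevity penalty: penalise hypotheses shorter than the reference.
    if len(hypothesis) >= len(reference):
        brevity = 1.0
    else:
        brevity = math.exp(1 - len(reference) / len(hypothesis))
    return brevity * math.exp(log_precision)


def filter_pairs(
    pairs: Iterable[Tuple[str, str]],
    translate: Callable[[str], str],  # hypothetical: source sentence -> hypothesis
    min_bleu: float = 0.2,            # assumed threshold
    max_len_ratio: float = 3.0,       # assumed threshold
) -> List[Tuple[str, str, float]]:
    """Keep pairs whose target agrees with the translated source, best first."""
    kept = []
    for source, target in pairs:
        src_tokens, tgt_tokens = source.split(), target.split()
        if not src_tokens or not tgt_tokens:
            continue  # empty side
        longer = max(len(src_tokens), len(tgt_tokens))
        shorter = min(len(src_tokens), len(tgt_tokens))
        if longer / shorter > max_len_ratio:
            continue  # sentence-length heuristic
        if source.strip() == target.strip():
            continue  # source copied into target (similarity heuristic)
        score = smoothed_bleu(tgt_tokens, translate(source).split())
        if score >= min_bleu:
            kept.append((source, target, score))
    return sorted(kept, key=lambda item: item[2], reverse=True)
```

In practice translate would wrap a trained model, and whitespace tokenisation would be replaced by the tokeniser used to train it.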