1 |
Machine Learning approaches for Topic and Sentiment Analysis in multilingual opinions and low-resource languages: From English to Guarani
|
|
|
|
BASE
|
|
|
|
2 |
Investigating alignment interpretability for low-resource NMT
|
|
|
|
In: ISSN: 0922-6567 ; EISSN: 1573-0573 ; Machine Translation ; https://hal.archives-ouvertes.fr/hal-03139744 ; Machine Translation, Springer Verlag, 2021, ⟨10.1007/s10590-020-09254-w⟩ (2021)
|
|
BASE
|
|
|
|
3 |
A speech-enabled fixed-phrase translator for healthcare accessibility
|
|
|
|
In: Proceedings of the 1st Workshop on NLP for Positive Impact (2021)
|
|
BASE
|
|
|
|
4 |
Modeling phones, keywords, topics and intents in spoken languages
|
|
|
|
BASE
|
|
|
|
5 |
Multilingual offensive language identification for low-resource languages
|
|
|
|
In: 21 ; 1 (2021)
|
|
BASE
|
|
|
|
7 |
Terminology-aware sentence mining for NMT domain adaptation: ADAPT’s submission to the Adap-MT 2020 English-to-Hindi AI translation shared task
|
|
|
|
In: Haque, Rejwanul orcid:0000-0003-1680-0099, Moslem, Yasmin orcid:0000-0003-4595-6877 and Way, Andy orcid:0000-0001-5736-5930 (2020) Terminology-aware sentence mining for NMT domain adaptation: ADAPT’s submission to the Adap-MT 2020 English-to-Hindi AI translation shared task. In: Workshop on Low Resource Domain Adaptation for Indic Machine Translation (Adap-MT 2020), 18-21 Dec 2020, Patna, India (Online). (In Press) (2020)
|
|
BASE
|
|
|
|
8 |
Building a Universal Dependencies Treebank for Occitan
|
|
|
|
In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020) ; 12th Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-02892715 ; 12th Language Resources and Evaluation Conference, May 2020, Marseille, France. pp.2932-2939 (2020)
|
|
BASE
|
|
|
|
9 |
Findings of the LoResMT 2020 shared task on zero-shot for low-resource languages
|
|
|
|
BASE
|
|
|
|
10 |
Advanced Convolutional Neural Network-Based Hybrid Acoustic Models for Low-Resource Speech Recognition
|
|
|
|
In: Computers ; Volume 9 ; Issue 2 (2020)
|
|
BASE
|
|
|
|
11 |
Towards Language Service Creation and Customization for Low-Resource Languages
|
|
|
|
In: Information ; Volume 11 ; Issue 2 (2020)
|
|
BASE
|
|
|
|
12 |
Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-resource Settings
|
|
|
|
In: Interspeech 2019 ; https://hal.archives-ouvertes.fr/hal-02193867 ; Interspeech 2019, Sep 2019, Graz, Austria (2019)
|
|
BASE
|
|
|
|
13 |
Cross-lingual parsing with polyglot training and multi-treebank learning: a Faroese case study
|
|
|
|
In: Barry, James orcid:0000-0003-3051-585X, Wagner, Joachim orcid:0000-0002-8290-3849 and Foster, Jennifer orcid:0000-0002-7789-4853 (2019) Cross-lingual parsing with polyglot training and multi-treebank learning: a Faroese case study. In: The 2nd Workshop on Deep Learning Approaches for Low-Resource NLP (DeepLo 2019), 3-5 Nov 2019, Hong Kong, China. ISBN 978-1-950737-78-9 (2019)
|
|
BASE
|
|
|
|
14 |
Unsupervised word discovery for computational language documentation ; Découverte non-supervisée de mots pour outiller la linguistique de terrain
|
|
|
|
In: https://tel.archives-ouvertes.fr/tel-02286425 ; Artificial Intelligence [cs.AI]. Université Paris Saclay (COmUE), 2019. English. ⟨NNT : 2019SACLS062⟩ (2019)
|
|
BASE
|
|
|
|
15 |
Crowdsourcing the Paldaruo Speech Corpus of Welsh for Speech Technology
|
|
|
|
In: Information ; Volume 10 ; Issue 8 (2019)
|
|
Abstract:
Collecting speech data for a low-resource language is challenging when funding and resources are limited. This paper describes the process of designing, creating and using the Paldaruo Speech Corpus for developing speech technology for Welsh. Specifically, this paper focuses on the crowdsourcing of data using an app on smartphones and mobile devices, allowing speakers from across Wales to contribute. We discuss the development of reading prompts: isolated words and full sentences, as well as the metadata collected from contributors. We also provide background on the design of the Paldaruo App as well as the main uses for the corpus and its availability and licensing. The corpus was designed for the development of speech recognition for Welsh and has been used to create a number of other resources. These methods can be extended to other languages, and suggestions for other low-resource languages are discussed.
|
|
Keywords:
corpus; linguistic diversity; low-resource languages; speech recognition; speech technology
|
|
URL: https://doi.org/10.3390/info10080247
|
|
BASE
|
|
|
|
16 |
Constructing Uyghur Commonsense Knowledge Base by Knowledge Projection
|
|
|
|
In: Applied Sciences ; Volume 9 ; Issue 16 (2019)
|
|
BASE
|
|
|
|
17 |
The Usefulness of Imperfect Speech Data for ASR Development in Low-Resource Languages
|
|
|
|
In: Information ; Volume 10 ; Issue 9 (2019)
|
|
BASE
|
|
|
|
18 |
Improving Semantic Similarity with Cross-Lingual Resources: A Study in Bangla—A Low Resourced Language
|
|
|
|
In: Informatics ; Volume 6 ; Issue 2 (2019)
|
|
BASE
|
|
|
|
19 |
Adapting NMT to caption translation in Wikimedia Commons for low-resource languages ; Adaptando NMT a la traducción de pies de imagen en Wikimedia Commons para idiomas con pocos recursos
|
|
|
|
BASE
|
|
|
|
20 |
Filtering of Noisy Parallel Corpora Based on Hypothesis Generation
|
|
|
|
BASE
|
|