1 |
Investigating language impact in bilingual approaches for computational language documentation
BASE
2 |
Empirical evaluation of sequence-to-sequence models for word discovery in low-resource settings
BASE
4 |
Unwritten languages demand attention too! Word discovery with encoder-decoder models
BASE
5 |
A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments ...
|
|
Godard, P.; Adda, G.; Adda-Decker, M.; Benjumea, J.; Besacier, L.; Cooper-Leavitt, J.; Kouarata, G-N.; Lamel, L.; Maynard, H.; Mueller, M.; Rialland, A.; Stueker, S.; Yvon, F.; Zanon-Boito, M. (arXiv, 2017)
|
|
Abstract:
Most speech and language technologies are trained with massive amounts of speech and text information. However, most of the world's languages do not have such resources or a stable orthography. Systems constructed under these almost zero-resource conditions are promising not only for speech technology but also for computational language documentation. The goal of computational language documentation is to help field linguists (semi-)automatically analyze and annotate audio recordings of endangered and unwritten languages. Example tasks are automatic phoneme discovery or lexicon discovery from the speech signal. This paper presents a speech corpus collected during a realistic language documentation process. It is made up of 5k speech utterances in Mboshi (Bantu C25) aligned to French text translations. Speech transcriptions are also made available: they correspond to a non-standard graphemic form close to the language phonology. We present how the data was collected, cleaned and processed and we illustrate its ... : accepted to LREC 2018 ...
|
|
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
|
|
URL: https://dx.doi.org/10.48550/arxiv.1710.03501 https://arxiv.org/abs/1710.03501
|
|
BASE
6 |
The Nespole! VoIP multilingual corpora in tourism and medical domains
BASE