1 |
Speech technology for unwritten languages
|
|
|
|
In: ISSN: 2329-9290 ; EISSN: 2329-9304 ; IEEE/ACM Transactions on Audio, Speech and Language Processing ; https://hal.inria.fr/hal-02480675 ; IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2020, ⟨10.1109/TASLP.2020.2973896⟩ (2020)
|
2 |
Controlling Utterance Length in NMT-based Word Segmentation with Attention
|
|
|
|
In: International Workshop on Spoken Language Translation ; https://hal.archives-ouvertes.fr/hal-02343206 ; International Workshop on Spoken Language Translation, Nov 2019, Hong-Kong, China (2019)
|
3 |
Unsupervised word discovery for computational language documentation ; Découverte non-supervisée de mots pour outiller la linguistique de terrain
|
|
|
|
In: https://tel.archives-ouvertes.fr/tel-02286425 ; Artificial Intelligence [cs.AI]. Université Paris Saclay (COmUE), 2019. English. ⟨NNT : 2019SACLS062⟩ (2019)
|
7 |
Unsupervised Word Segmentation from Speech with Attention
|
|
|
|
In: Interspeech 2018 ; https://hal.archives-ouvertes.fr/hal-01818092 ; Interspeech 2018, Sep 2018, Hyderabad, India (2018)
|
8 |
Bayesian models for unit discovery on a very low resource language
|
|
|
|
In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ; https://hal.archives-ouvertes.fr/hal-01709589 ; IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2018, Calgary, Alberta, Canada (2018)
|
9 |
Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the “Speaking rosetta” JSALT 2017 workshop
|
|
|
|
In: ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing ; https://hal.archives-ouvertes.fr/hal-01709578 ; ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2018, Calgary, Alberta, Canada (2018)
|
10 |
Unsupervised Word Segmentation: does tone matter?
|
|
|
|
In: International Conference on Intelligent Text Processing and Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-01910756 ; International Conference on Intelligent Text Processing and Computational Linguistics, Mar 2018, Hanoï, Vietnam (2018)
|
11 |
Adaptor Grammars for the Linguist: Word Segmentation Experiments for Very Low-Resource Languages
|
|
|
|
In: Workshop on Computational Research in Phonetics, Phonology, and Morphology ; https://hal.archives-ouvertes.fr/hal-01910757 ; Workshop on Computational Research in Phonetics, Phonology, and Morphology, Oct 2018, Bruxelles, Belgium. pp.32 - 42, ⟨10.18653/v1/P17⟩ (2018)
|
12 |
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
|
|
|
|
In: 11th edition of the Language Resources and Evaluation Conference (LREC 2018) ; https://hal.archives-ouvertes.fr/hal-01710043 ; 11th edition of the Language Resources and Evaluation Conference (LREC 2018), ELRA, May 2018, Miyazaki, Japan (2018)
|
15 |
BULB: Breaking the Unwritten Language Barrier
|
|
Adda, Gilles; Adda-Decker, Martine; Ambouroue, Odette; Besacier, Laurent; Blachon, David; Maynard, Hélène; Godard, Pierre; Hamlaoui, Fatima; Idiatov, Dmitry; Kouarata, Guy-Noël; Lamel, Lori; Makasso, Emmanuel-Moselly; Mariani, Joseph; Rialland, Annie; Stücker, Sebastian; Van De Velde, Mark; Gauthier, Elodie; Yvon, François; Zerbian, Sabine
|
|
In: Procedia Computer Science ; Computational Methods for Endangered Language Documentation and Description ; https://hal.archives-ouvertes.fr/hal-01836496 ; Computational Methods for Endangered Language Documentation and Description, May 2016, Yogyakarta, Indonesia. pp.8-14, ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
Abstract:
The project Breaking the Unwritten Language Barrier (BULB), which brings together linguists and computer scientists, aims at supporting linguists in documenting unwritten languages. In order to achieve this we develop tools tailored to the needs of documentary linguists by building upon technology and expertise from the area of natural language processing, most prominently automatic speech recognition and machine translation. As a development and test bed for this we have chosen three less-resourced African languages from the Bantu family: Basaa, Myene and Embosi. Work within the project is divided into three main steps:
1) Collection of a large corpus of speech (100h per language) at a reasonable cost. For this we use standard mobile devices and a dedicated software, Lig-Aikuma. After initial recording, the data is re-spoken by a reference speaker to enhance the signal quality and orally translated into French.
2) Automatic transcription of the Bantu languages at phoneme level and the French translation at word level. The recognized Bantu phonemes and French words will then be automatically aligned.
3) Tool development. In close cooperation and discussion with the linguists, the speech and language technologists will design and implement tools that will support the linguists in their work, taking into account the linguists’ needs and technology’s capabilities.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO]Computer Science [cs]; Endangered Languages; Low Resource Language
|
|
URL: https://hal.archives-ouvertes.fr/hal-01836496 https://doi.org/10.1016/j.procs.2016.04.023
|
16 |
Breaking the unwritten language barrier: the BULB project
|
|
|
|
In: SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages ; https://halshs.archives-ouvertes.fr/halshs-01428027 ; SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages, May 2016, Yogyakarta, Indonesia. ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
17 |
Preliminary Experiments on Unsupervised Word Discovery in Mboshi
|
|
|
|
In: Interspeech 2016 proceedings ; Interspeech 2016 ; https://hal.archives-ouvertes.fr/hal-01350119 ; Interspeech 2016, Sep 2016, San-Francisco, United States (2016)
|
18 |
Innovative technologies for under-resourced language documentation: The BULB Project
|
|
|
|
In: CCURL proceedings ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC ; https://hal.archives-ouvertes.fr/hal-01350124 ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC, May 2016, Portoroz, Slovenia (2016)
|