1 |
Adaptor Grammars for the Linguist: Word Segmentation Experiments for Very Low-Resource Languages
|
|
|
|
In: Workshop on Computational Research in Phonetics, Phonology, and Morphology ; https://hal.archives-ouvertes.fr/hal-01910757 ; Workshop on Computational Research in Phonetics, Phonology, and Morphology, Oct 2018, Bruxelles, Belgium. pp.32 - 42, ⟨10.18653/v1/P17⟩ (2018)
|
|
2 |
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
|
|
|
|
In: 11th edition of the Language Resources and Evaluation Conference (LREC 2018) ; https://hal.archives-ouvertes.fr/hal-01710043 ; 11th edition of the Language Resources and Evaluation Conference (LREC 2018), ELRA, May 2018, Miyazaki, Japan (2018)
|
|
3 |
A corpus based study of morpheme deletion in a low resourced language: A case study for Embosi
|
|
|
|
In: Annual Meeting of the Linguistic Society of America ; https://hal.archives-ouvertes.fr/hal-01837164 ; Annual Meeting of the Linguistic Society of America, Jan 2018, Salt Lake City, United States (2018)
|
|
4 |
Developing an Embosi (Bantu C25) Speech Variant Dictionary to Model Vowel Elision and Morpheme Deletion
|
|
|
|
In: Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-01837178 ; Annual Conference of the International Speech Communication Association, ISCA, Aug 2017, Stockholm, Sweden (2017)
|
|
5 |
Corpus based linguistic exploration via forced alignments with a ‘light-weight’ ASR tool
|
|
|
|
In: Language & Technology Conference : Human Language Technologies as a Challenge for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01837174 ; Language & Technology Conference : Human Language Technologies as a Challenge for Computer Science and Linguistics, Nov 2017, Poznań, Poland (2017)
|
|
6 |
LIG-AIKUMA: a Mobile App to Collect Parallel Speech for Under-Resourced Language Studies
|
|
|
|
In: Interspeech 2016 proceedings ; Interspeech 2016 (short demo paper) ; https://hal.archives-ouvertes.fr/hal-01350062 ; Interspeech 2016 (short demo paper), Sep 2016, San-Francisco, United States (2016)
|
|
7 |
BULB: Breaking the Unwritten Language Barrier
|
|
Adda, Gilles; Adda-Decker, Martine; Ambouroue, Odette; Besacier, Laurent; Blachon, David; Maynard, Hélène; Godard, Pierre; Hamlaoui, Fatima; Idiatov, Dmitry; Kouarata, Guy-Noël; Lamel, Lori; Makasso, Emmanuel-Moselly; Mariani, Joseph; Rialland, Annie; Stücker, Sebastian; Van De Velde, Mark; Gauthier, Elodie; Yvon, François; Zerbian, Sabine
|
|
In: Procedia Computer Science ; Computational Methods for Endangered Language Documentation and Description ; https://hal.archives-ouvertes.fr/hal-01836496 ; Computational Methods for Endangered Language Documentation and Description, May 2016, Yogyakarta, Indonesia. pp.8-14, ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
Abstract:
The project Breaking the Unwritten Language Barrier (BULB), which brings together linguists and computer scientists, aims at supporting linguists in documenting unwritten languages. In order to achieve this we develop tools tailored to the needs of documentary linguists by building upon technology and expertise from the area of natural language processing, most prominently automatic speech recognition and machine translation. As a development and test bed for this we have chosen three less-resourced African languages from the Bantu family: Basaa, Myene and Embosi. Work within the project is divided into three main steps:
1) Collection of a large corpus of speech (100h per language) at a reasonable cost. For this we use standard mobile devices and a dedicated software, Lig-Aikuma. After initial recording, the data is re-spoken by a reference speaker to enhance the signal quality and orally translated into French.
2) Automatic transcription of the Bantu languages at phoneme level and the French translation at word level. The recognized Bantu phonemes and French words will then be automatically aligned.
3) Tool development. In close cooperation and discussion with the linguists, the speech and language technologists will design and implement tools that will support the linguists in their work, taking into account the linguists’ needs and technology’s capabilities.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO]Computer Science [cs]; Endangered Languages; Low Resource Language
|
|
URL: https://hal.archives-ouvertes.fr/hal-01836496 https://doi.org/10.1016/j.procs.2016.04.023
|
|
8 |
Breaking the unwritten language barrier: the BULB project
|
|
|
|
In: SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages ; https://halshs.archives-ouvertes.fr/halshs-01428027 ; SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages, May 2016, Yogyakarta, Indonesia. ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
9 |
Preliminary Experiments on Unsupervised Word Discovery in Mboshi
|
|
|
|
In: Interspeech 2016 proceedings ; Interspeech 2016 ; https://hal.archives-ouvertes.fr/hal-01350119 ; Interspeech 2016, Sep 2016, San-Francisco, United States (2016)
|
|
10 |
Innovative technologies for under-resourced language documentation: The BULB Project
|
|
|
|
In: CCURL proceedings ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC ; https://hal.archives-ouvertes.fr/hal-01350124 ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC, May 2016, Portoroz, Slovenia (2016)
|
|
14 |
Dropping of the Class-Prefix Consonant, Vowel Elision and Automatic Phonological Mining in Embosi (Bantu C 25)
|
|
|
|
In: ISBN: 9781574734652 ; Selected Proceedings of the 44th Annual Conference on African Linguistics ; https://halshs.archives-ouvertes.fr/halshs-01251202 ; Selected Proceedings of the 44th Annual Conference on African Linguistics, Ruth Kramer, Elizabeth C. Zsiga, and One Tlale Boyer, Cascadilla Proceedings Project, 2015, pp. 221-230 (2015)
|
|
15 |
Embosi: automatic alignment with segments and words and phonological mining
|
|
|
|
In: International Conference on Bantu Languages ; https://hal.archives-ouvertes.fr/hal-01843438 ; International Conference on Bantu Languages, Jan 2013, Paris, France (2013)
|
|
16 |
Embosi : automatic alignment with segments and words and phonological mining
|
|
|
|
In: International Conference on Bantu Languages (BANTU 2013) ; https://halshs.archives-ouvertes.fr/halshs-01424894 ; International Conference on Bantu Languages (BANTU 2013), Jun 2013, Paris, France (2013)
|
|