1 |
Comparaisons dialectométriques de parlers du Croissant avec d'autres parlers d'oc et d'oïl
|
|
|
|
In: Le Croissant linguistique entre oc, oïl et francoprovençal : des mots à la grammaire, des parlers aux aires ; https://hal.archives-ouvertes.fr/hal-03318765 ; Le Croissant linguistique entre oc, oïl et francoprovençal : des mots à la grammaire, des parlers aux aires, 2021 (2021)
|
|
BASE
|
|
|
|
2 |
A Speaking Atlas of Minority Languages of France: Collection and Analyses of Dialectical Data
|
|
|
|
In: International Congress of Phonetic Sciences ; https://hal.archives-ouvertes.fr/hal-02387368 ; International Congress of Phonetic Sciences, Sasha Calhoun, Paola Escudero, Marija Tabain and Paul Warren (Eds.), Aug 2019, Melbourne, Australia (2019)
|
|
BASE
|
|
|
|
3 |
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
|
|
|
|
In: 11th edition of the Language Resources and Evaluation Conference (LREC 2018) ; https://hal.archives-ouvertes.fr/hal-01710043 ; 11th edition of the Language Resources and Evaluation Conference (LREC 2018), ELRA, May 2018, Miyazaki, Japan (2018)
|
|
BASE
|
|
|
|
4 |
A corpus based study of morpheme deletion in a low resourced language: A case study for Embosi
|
|
|
|
In: Annual Meeting of the Linguistic Society of America ; https://hal.archives-ouvertes.fr/hal-01837164 ; Annual Meeting of the Linguistic Society of America, Jan 2018, Salt Lake City, United States (2018)
|
|
BASE
|
|
|
|
5 |
Developing an Embosi (Bantu C25) Speech Variant Dictionary to Model Vowel Elision and Morpheme Deletion
|
|
|
|
In: Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-01837178 ; Annual Conference of the International Speech Communication Association, ISCA, Aug 2017, Stockholm, Sweden (2017)
|
|
BASE
|
|
|
|
6 |
Corpus based linguistic exploration via forced alignments with a ‘light-weight’ ASR tool
|
|
|
|
In: Language & Technology Conference : Human Language Technologies as a Challenge for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01837174 ; Language & Technology Conference : Human Language Technologies as a Challenge for Computer Science and Linguistics, Nov 2017, Poznań, Poland (2017)
|
|
BASE
|
|
|
|
7 |
BULB: Breaking the Unwritten Language Barrier
|
|
|
|
In: Procedia Computer Science ; Computational Methods for Endangered Language Documentation and Description ; https://hal.archives-ouvertes.fr/hal-01836496 ; Computational Methods for Endangered Language Documentation and Description, May 2016, Yogyakarta, Indonesia. pp.8-14, ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
BASE
|
|
|
|
8 |
Breaking the unwritten language barrier: the BULB project
|
|
|
|
In: SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages ; https://halshs.archives-ouvertes.fr/halshs-01428027 ; SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages, May 2016, Yogyakarta, Indonesia. ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
BASE
|
|
|
|
9 |
Innovative technologies for under-resourced language documentation: The BULB Project
|
|
|
|
In: CCURL proceedings ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC ; https://hal.archives-ouvertes.fr/hal-01350124 ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC, May 2016, Portoroz, Slovenia (2016)
|
|
BASE
|
|
|
|
10 |
Breaking the unwritten language barrier: the BULB project
|
|
|
|
In: SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages ; https://halshs.archives-ouvertes.fr/halshs-01428027 ; SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages, May 2016, Yogyakarta, Indonesia. ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
BASE
|
|
|
|
11 |
Innovative technologies for under-resourced language documentation: The BULB Project
|
|
Lamel, Lori; Makasso, Emmanuel-Moselly; Rialland, Annie; Yvon, François; Besacier, Laurent; Gauthier, Elodie; Blachon, David; Van De Velde, Mark; Godard, Pierre; Bonneau-Maynard, Hélène; Stuker, Sebastian; Hamlaoui, Fatima; Ambouroue, Odette; Adda-Decker, Martine; Zerbian, Sabine; Kouarata, Guy-Noël; Adda, Gilles; Idiatov, Dmitry
|
|
In: CCURL proceedings ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC ; https://hal.archives-ouvertes.fr/hal-01350124 ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC, May 2016, Portoroz, Slovenia (2016)
|
|
Abstract:
The project Breaking the Unwritten Language Barrier (BULB), which brings together linguists and computer scientists, aims at supporting linguists in documenting unwritten languages. In order to achieve this, we will develop tools tailored to the needs of documentary linguists by building upon technology and expertise from the area of natural language processing, most prominently automatic speech recognition and machine translation. As a development and test bed for this we have chosen three less-resourced African languages from the Bantu family: Basaa, Myene and Embosi. Work within the project is divided into three main steps: 1) Collection of a large corpus of speech (100h per language) at a reasonable cost. After initial recording, the data is re-spoken by a reference speaker to enhance the signal quality and orally translated into French. 2) Automatic transcription of the Bantu languages at phoneme level and the French translation at word level. The recognized Bantu phonemes and French words will then be automatically aligned. 3) Tool development. In close cooperation and discussion with the linguists, the speech and language technologists will design and implement tools that will support the linguists in their work, taking into account the linguists' needs and technology's capabilities. The data collection has begun for the three languages. For this we use standard mobile devices and a dedicated software, LIG-AIKUMA, which proposes a range of different speech collection modes (recording, respeaking, translation and elicitation). LIG-AIKUMA's improved features include a smart generation and handling of speaker metadata as well as respeaking and parallel audio data mapping.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; automatic alignment; automatic phonetic transcription; Language documentation; unwritten languages
|
|
URL: https://hal.archives-ouvertes.fr/hal-01350124 https://hal.archives-ouvertes.fr/hal-01350124/document https://hal.archives-ouvertes.fr/hal-01350124/file/CCURL_BULB_2016.pdf
|
|
BASE
|
|
|
|
12 |
BULB: Breaking the Unwritten Language Barrier
|
|
|
|
In: Procedia Computer Science ; Computational Methods for Endangered Language Documentation and Description ; https://hal.archives-ouvertes.fr/hal-01836496 ; Computational Methods for Endangered Language Documentation and Description, May 2016, Yogyakarta, Indonesia. pp.8-14, ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
BASE
|
|
|
|
13 |
Automatic language identity tagging on word and sentence-level in multilingual text sources: a case-study on Luxembourgish
|
|
|
|
In: International Conference on Language Resources and Evaluation ; https://hal.archives-ouvertes.fr/hal-01843401 ; International Conference on Language Resources and Evaluation, May 2014, Reykjavik, Iceland (2014)
|
|
BASE
|
|
|
|
14 |
Automatic Language Identity Tagging on Word and Sentence-Level in Multilingual Text Sources: a Case-Study on Luxembourgish
|
|
|
|
In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC'14) ; Ninth International Conference on Language Resources and Evaluation (LREC'14) ; https://hal.archives-ouvertes.fr/hal-01134776 ; Ninth International Conference on Language Resources and Evaluation (LREC'14), European Language Resources Association (ELRA), May 2014, Reykjavik, Iceland. pp.3300-3304 ; http://lrec2014.lrec-conf.org/en/ (2014)
|
|
BASE
|
|
|
|
15 |
Modélisation acoustico-phonétique de langues peu dotées : Études phonétiques et travaux de reconnaissance automatique en luxembourgois
|
|
|
|
In: Journées d'Etude sur la Parole ; https://hal.archives-ouvertes.fr/hal-01843399 ; Journées d'Etude sur la Parole, Jan 2014, Le Mans, France (2014)
|
|
BASE
|
|
|
|
16 |
Speech Alignment and Recognition Experiments for Luxembourgish
|
|
|
|
In: Proceedings of the 4th International Workshop on Spoken Language Technologies for Underresourced Languages ; 4th International Workshop on Spoken Language Technologies for Underresourced Languages ; https://hal.archives-ouvertes.fr/hal-01134824 ; 4th International Workshop on Spoken Language Technologies for Underresourced Languages, May 2014, Saint Petersburg, Russia. pp.53-60 ; http://www.mica.edu.vn/sltu2014/ (2014)
|
|
BASE
|
|
|
|
17 |
A First LVCSR System for Luxembourgish, a Low-Resourced European Language
|
|
|
|
In: Human Language Technology Challenges for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01135103 ; Zygmunt Vetulani; Joseph Mariani. Human Language Technology Challenges for Computer Science and Linguistics, 8387, Springer International Publishing, pp.479-490, 2014, 5th Language and Technology Conference, LTC 2011, Poznań, Poland, November 25-27, 2011, Revised Selected Papers, 978-3-319-08957-7. ⟨10.1007/978-3-319-08958-4_39⟩ (2014)
|
|
BASE
|
|
|
|
18 |
What we can learn from ASR errors about low-resourced languages: a case-study of Luxembourgish and Austrian
|
|
|
|
In: Errors by Humans and Machines in Multimedia, Multimodal, Multilingual Data Processing ; https://hal.archives-ouvertes.fr/hal-01843440 ; Errors by Humans and Machines in Multimedia, Multimodal, Multilingual Data Processing, Jan 2013, Ermenonville, France (2013)
|
|
BASE
|
|
|
|
19 |
What we can learn from ASR errors about low-resourced languages: a case-study of Luxembourgish and Austrian
|
|
|
|
In: Errors by Humans and Machines in Multimedia, Multimodal, Multilingual Data Processing (ERRARE 2013) ; https://halshs.archives-ouvertes.fr/halshs-01424902 ; Errors by Humans and Machines in Multimedia, Multimodal, Multilingual Data Processing (ERRARE 2013), Nov 2013, Ermenonville, France (2013)
|
|
BASE
|
|
|
|
20 |
Systèmes de transcription comme instruments
|
|
|
|
In: Méthodes et outils pour l'analyse phonétique des grands corpus oraux ; https://hal.archives-ouvertes.fr/hal-01135113 ; Nguyen Noël; Adda-Decker Martine. Méthodes et outils pour l'analyse phonétique des grands corpus oraux, Hermes Science Publications, pp.159-202, 2013, Cognition et Traitement de l'Information, 978-2746245303 (2013)
|
|
BASE
|
|
|
|
|
|