101 |
BULB: Breaking the Unwritten Language Barrier
|
|
|
|
In: Procedia Computer Science ; Computational Methods for Endangered Language Documentation and Description ; https://hal.archives-ouvertes.fr/hal-01836496 ; Computational Methods for Endangered Language Documentation and Description, May 2016, Yogyakarta, Indonesia. pp.8-14, ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
BASE
|
|
|
|
102 |
Breaking the unwritten language barrier: the BULB project
|
|
|
|
In: SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages ; https://halshs.archives-ouvertes.fr/halshs-01428027 ; SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages, May 2016, Yogyakarta, Indonesia. ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
BASE
|
|
|
|
103 |
Preliminary Experiments on Unsupervised Word Discovery in Mboshi
|
|
|
|
In: Interspeech 2016 proceedings ; Interspeech 2016 ; https://hal.archives-ouvertes.fr/hal-01350119 ; Interspeech 2016, Sep 2016, San Francisco, United States (2016)
|
|
BASE
|
|
|
|
104 |
Innovative technologies for under-resourced language documentation: The BULB Project
|
|
|
|
In: CCURL proceedings ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC ; https://hal.archives-ouvertes.fr/hal-01350124 ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC, May 2016, Portoroz, Slovenia (2016)
|
|
BASE
|
|
|
|
105 |
Listen and Translate: A Proof of Concept for End-to-End Speech-to-Text Translation ...
|
|
|
|
BASE
|
|
|
|
106 |
Inducing Multilingual Text Analysis Tools Using Bidirectional Recurrent Neural Networks ...
|
|
|
|
BASE
|
|
|
|
107 |
Word2Vec vs DBnary: Augmenting METEOR using Vector Representations or Lexical Resources? ...
|
|
|
|
BASE
|
|
|
|
108 |
Breaking the unwritten language barrier: the BULB project
|
|
|
|
In: SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages ; https://halshs.archives-ouvertes.fr/halshs-01428027 ; SLTU-2016 5th Workshop on Spoken Language Technologies for Under-resourced languages, May 2016, Yogyakarta, Indonesia. ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
BASE
|
|
|
|
109 |
Innovative technologies for under-resourced language documentation: The BULB Project
|
|
|
|
In: CCURL proceedings ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC ; https://hal.archives-ouvertes.fr/hal-01350124 ; Workshop CCURL 2016 - Collaboration and Computing for Under-Resourced Languages - LREC, May 2016, Portoroz, Slovenia (2016)
|
|
BASE
|
|
|
|
110 |
BULB: Breaking the Unwritten Language Barrier
|
|
Adda, Gilles; Adda-Decker, Martine; Ambouroue, Odette; Besacier, Laurent; Blachon, David; Maynard, Hélène; Godard, Pierre; Hamlaoui, Fatima; Idiatov, Dmitry; Kouarata, Guy-Noël; Lamel, Lori; Makasso, Emmanuel-Moselly; Mariani, Joseph; Rialland, Annie; Stüker, Sebastian; Van De Velde, Mark; Gauthier, Elodie; Yvon, François; Zerbian, Sabine
|
|
In: Procedia Computer Science ; Computational Methods for Endangered Language Documentation and Description ; https://hal.archives-ouvertes.fr/hal-01836496 ; Computational Methods for Endangered Language Documentation and Description, May 2016, Yogyakarta, Indonesia. pp.8-14, ⟨10.1016/j.procs.2016.04.023⟩ (2016)
|
|
Abstract:
The project Breaking the Unwritten Language Barrier (BULB), which brings together linguists and computer scientists, aims at supporting linguists in documenting unwritten languages. In order to achieve this we develop tools tailored to the needs of documentary linguists by building upon technology and expertise from the area of natural language processing, most prominently automatic speech recognition and machine translation. As a development and test bed for this we have chosen three less-resourced African languages from the Bantu family: Basaa, Myene and Embosi. Work within the project is divided into three main steps:
1) Collection of a large corpus of speech (100h per language) at a reasonable cost. For this we use standard mobile devices and a dedicated software, Lig-Aikuma. After initial recording, the data is re-spoken by a reference speaker to enhance the signal quality and orally translated into French.
2) Automatic transcription of the Bantu languages at phoneme level and the French translation at word level. The recognized Bantu phonemes and French words will then be automatically aligned.
3) Tool development. In close cooperation and discussion with the linguists, the speech and language technologists will design and implement tools that will support the linguists in their work, taking into account the linguists’ needs and technology’s capabilities.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO]Computer Science [cs]; Endangered Languages; Low Resource Language
|
|
URL: https://hal.archives-ouvertes.fr/hal-01836496 https://doi.org/10.1016/j.procs.2016.04.023
|
|
BASE
|
|
|
|
111 |
Introduction au numéro spécial sur le traitement automatique du langage parlé
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-01150048 ; Laurent Besacier et Wolfgang Minker. France. 55 (2), Hermès, 2015, Traitement Automatique des Langues (2015)
|
|
BASE
|
|
|
|
112 |
Unsupervised Speaker Identification in TV Broadcast Based on Written Names
|
|
|
|
In: ISSN: 1558-7916 ; IEEE Transactions on Audio, Speech and Language Processing ; https://hal.archives-ouvertes.fr/hal-01060827 ; IEEE Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2015, 23 (1), pp.57-68. ⟨10.1109/TASLP.2014.2367822⟩ ; https://dl.acm.org/authorize?N46627 (2015)
|
|
BASE
|
|
|
|
113 |
An Open Source Toolkit for Word-level Confidence Estimation in Machine Translation
|
|
|
|
In: The 12th International Workshop on Spoken Language Translation (IWSLT'15) ; https://hal.archives-ouvertes.fr/hal-01244477 ; The 12th International Workshop on Spoken Language Translation (IWSLT'15), Dec 2015, Da Nang, Vietnam ; http://workshop2015.iwslt.org/ (2015)
|
|
BASE
|
|
|
|
114 |
Utilisation des réseaux de neurones récurrents pour la projection interlingue d'étiquettes morpho-syntaxiques à partir d'un corpus parallèle
|
|
|
|
In: Actes de la 22e conférence sur le Traitement Automatique des Langues Naturelles ; TALN 2015 ; https://hal.archives-ouvertes.fr/hal-01350115 ; TALN 2015, Jul 2015, Caen, France (2015)
|
|
BASE
|
|
|
|
115 |
LIG at MediaEval 2015 Multimodal Person Discovery in Broadcast TV Task
|
|
|
|
In: Working Notes Proceedings of the MediaEval 2015 Workshop Wurzen, Germany, September 14-15, 2015 ; MediaEval 2015 Workshop ; https://hal.archives-ouvertes.fr/hal-01350079 ; MediaEval 2015 Workshop, Sep 2015, Wurzen, Germany (2015)
|
|
BASE
|
|
|
|
116 |
Unsupervised and Lightly Supervised Part-of-Speech Tagging Using Recurrent Neural Networks
|
|
|
|
In: The 29th Pacific Asia Conference on Language, Information and Computation ; 29th Pacific Asia Conference on Language, Information and Computation (PACLIC) ; https://hal.archives-ouvertes.fr/hal-01350113 ; 29th Pacific Asia Conference on Language, Information and Computation (PACLIC), Oct 2015, Shanghai, China (2015)
|
|
BASE
|
|
|
|
117 |
Collaborative Annotation for Person Identification in TV Shows
|
|
|
|
In: Interspeech 2015 (short demo paper) ; https://hal.archives-ouvertes.fr/hal-01170513 ; Interspeech 2015 (short demo paper), Sep 2015, Dresden, Germany (2015)
|
|
BASE
|
|