Home
Catalogue search
Refine your search:
Keyword
Creator / Publisher:
ANR-14-CE35-0002,BULB,Breaking the Unwritten Language Barrier(2014) (1)
Adda, G (1)
Adda-Decker, Martine (1)
Benjumea, J (1)
Besacier, Laurent (1)
Cooper-Leavitt, J (1)
Godard, P. (1)
Groupe d’Étude en Traduction Automatique / Traitement Automatisé des Langues et de la Parole (GETALP ) (1)
Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes 2016-2019 (UGA 2016-2019 ) (1)
Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes 2016-2019 (UGA 2016-2019 )-Institut polytechnique de Grenoble - Grenoble Institute of Technology (Grenoble INP )-Centre National de la Recherche Scientifique (CNRS)-Université Grenoble Alpes 2016-2019 (UGA 2016-2019 ) (1)
more
Year:
2018 (1)
Medium
Type
BLLDB-Access
Search in the Catalogues and Directories
All fields
Title
Creator / Publisher
Keyword
Year
AND
OR
AND NOT
All fields
Title
Creator / Publisher
Keyword
Year
AND
OR
AND NOT
All fields
Title
Creator / Publisher
Keyword
Year
AND
OR
AND NOT
All fields
Title
Creator / Publisher
Keyword
Year
AND
OR
AND NOT
All fields
Title
Creator / Publisher
Keyword
Year
Sort by
creator [A → Z]
'
creator [Z → A]
'
publishing year ↑ (asc)
'
publishing year ↓ (desc)
'
title [A → Z]
'
title [Z → A]
'
Simple Search
Hits 1 – 1 of 1
1
A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
Godard, P.
;
Adda, G
;
Adda-Decker, Martine
;
Benjumea, J
;
Besacier, Laurent
;
Cooper-Leavitt, J
;
Kouarata, G-N
;
Lamel, L
;
Maynard, H
;
Müller, M.
;
Rialland, A
;
Stüker, S.
;
Yvon, F.
;
Zanon-Boito, M
In: Language Resources and Evaluation Conference (LREC) ; https://hal.archives-ouvertes.fr/hal-01807093 ; Language Resources and Evaluation Conference (LREC), Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Pi, May 2018, Miyazaki, Japan (2018)
Abstract:
International audience ; Most speech and language technologies are trained with massive amounts of speech and text information. However, most of the world languages do not have such resources and some even lack a stable orthography. Building systems under these almost zero resource conditions is not only promising for speech technology but also for computational language documentation. The goal of computational language documentation is to help field linguists to (semi-)automatically analyze and annotate audio recordings of endangered, unwritten languages. Example tasks are automatic phoneme discovery or lexicon discovery from the speech signal. This paper presents a speech corpus collected during a realistic language documentation process. It is made up of 5k speech utterances in Mboshi (Bantu C25) aligned to French text translations. Speech transcriptions are also made available: they correspond to a non-standard graphemic form close to the language phonology. We detail how the data was collected, cleaned and processed and we illustrate its use through a zero-resource task: spoken term discovery. The dataset is made available to the community for reproducible computational language documentation experiments and their evaluation.
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
;
field linguistics
;
language documentation
;
spoken term discovery
;
unwritten languages
;
word segmentation
;
zero resource technologies
URL:
https://hal.archives-ouvertes.fr/hal-01807093/document
https://hal.archives-ouvertes.fr/hal-01807093/file/lrec2018_mboshi_final-3.pdf
https://hal.archives-ouvertes.fr/hal-01807093
BASE
Hide details
Mobile view
All
Catalogues
UB Frankfurt Linguistik
0
IDS Mannheim
0
OLC Linguistik
0
UB Frankfurt Retrokatalog
0
DNB Subject Category Language
0
Institut für Empirische Sprachwissenschaft
0
Leibniz-Centre General Linguistics (ZAS)
0
Bibliographies
BLLDB
0
BDSL
0
IDS Bibliografie zur deutschen Grammatik
0
IDS Bibliografie zur Gesprächsforschung
0
IDS Konnektoren im Deutschen
0
IDS Präpositionen im Deutschen
0
IDS OBELEX meta
0
MPI-SHH Linguistics Collection
0
MPI for Psycholinguistics
0
Linked Open Data catalogues
Annohub
0
Online resources
Link directory
0
Journal directory
0
Database directory
0
Dictionary directory
0
Open access documents
BASE
1
Linguistik-Repository
0
IDS Publikationsserver
0
Online dissertations
0
Language Description Heritage
0
© 2013 - 2024 Lin|gu|is|tik
|
Imprint
|
Privacy Policy
|
Datenschutzeinstellungen ändern