61 |
Controlling Utterance Length in NMT-based Word Segmentation with Attention ...
|
|
|
|
BASE
|
|
|
|
62 |
Controlling Utterance Length in NMT-based Word Segmentation with Attention ...
|
|
|
|
BASE
|
|
|
|
63 |
Controlling Utterance Length in NMT-based Word Segmentation with Attention ...
|
|
|
|
BASE
|
|
|
|
64 |
MaSS - Multilingual corpus of Sentence-aligned Spoken utterances ...
|
|
|
|
BASE
|
|
|
|
65 |
MaSS - Multilingual corpus of Sentence-aligned Spoken utterances ...
|
|
|
|
Abstract:
The CMU Wilderness Multilingual Speech Dataset is a newly published multilingual speech dataset based on recorded readings of the New Testament. It provides data to build Automatic Speech Recognition (ASR) and Text-to-Speech (TTS) models for potentially 700 languages. However, the fact that the source content (the Bible) is the same for all the languages has not been exploited to date. Therefore, this article proposes to add multilingual links between speech segments in different languages, and shares a large and clean dataset of 8,130 parallel spoken utterances across 8 languages (56 language pairs). We name this corpus MaSS (Multilingual corpus of Sentence-aligned Spoken utterances). The covered languages (Basque, English, Finnish, French, Hungarian, Romanian, Russian and Spanish) allow research on speech-to-speech alignment as well as on translation for syntactically divergent language pairs. The quality of the final corpus is attested by human evaluation performed on a corpus subset (100 utterances, ...
|
|
Keyword:
parallel speech corpus, multilingual alignment, speech-to-speech alignment, speech-to-speech translation
|
|
URL: https://dx.doi.org/10.5281/zenodo.3354711 https://zenodo.org/record/3354711
|
|
BASE
|
|
|
|
66 |
How Does Language Influence Documentation Workflow? Unsupervised Word Discovery Using Translations in Multiple Languages ...
|
|
|
|
BASE
|
|
|
|
67 |
Word Recognition, Competition, and Activation in a Model of Visually Grounded Speech ...
|
|
|
|
BASE
|
|
|
|
68 |
Models of Visually Grounded Speech Signal Pay Attention To Nouns: a Bilingual Experiment on English and Japanese ...
|
|
|
|
BASE
|
|
|
|
69 |
ASR performance prediction on unseen broadcast programs using convolutional neural networks
|
|
|
|
In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ; https://hal.archives-ouvertes.fr/hal-01709779 ; IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2018, Calgary, Alberta, Canada (2018)
|
|
BASE
|
|
|
|
70 |
A small Griko-Italian speech translation corpus
|
|
|
|
In: 6th international workshop on spoken language technologies for under-resourced languages (SLTU'18) ; https://hal.archives-ouvertes.fr/hal-01962528 ; 6th international workshop on spoken language technologies for under-resourced languages (SLTU'18), Aug 2018, New Delhi, India (2018)
|
|
BASE
|
|
|
|
71 |
A Very Low Resource Language Speech Corpus for Computational Language Documentation Experiments
|
|
|
|
In: Language Resources and Evaluation Conference (LREC) ; https://hal.archives-ouvertes.fr/hal-01807093 ; Language Resources and Evaluation Conference (LREC), Nicoletta Calzolari (Conference chair) and Khalid Choukri and Christopher Cieri and Thierry Declerck and Sara Goggi and Koiti Hasida and Hitoshi Isahara and Bente Maegaard and Joseph Mariani and Hélène Mazo and Asuncion Moreno and Jan Odijk and Stelios Pi, May 2018, Miyazaki, Japan (2018)
|
|
BASE
|
|
|
|
72 |
Unsupervised Word Segmentation from Speech with Attention
|
|
|
|
In: Interspeech 2018 ; https://hal.archives-ouvertes.fr/hal-01818092 ; Interspeech 2018, Sep 2018, Hyderabad, India (2018)
|
|
BASE
|
|
|
|
73 |
Unlocking cultural conceptualisation in indigenous language resources: collaborative computing methodologies
|
|
|
|
In: Dorn, Amelie, Wandl-Vogt, Eveline orcid:0000-0002-0802-0255, Abgaz, Yalemisew orcid:0000-0002-3887-5342, Benito Santos, Alejandro orcid:0000-0001-5317-6390 and Therón, Roberto orcid:0000-0001-6739-8875 (2018) Unlocking cultural conceptualisation in indigenous language resources: collaborative computing methodologies. In: 11th Language Resources and Evaluation Conference, 7-12 May 2018, Miyazaki, Japan. ISBN 979-10-95546-22-1 (2018)
|
|
BASE
|
|
|
|
74 |
Bayesian models for unit discovery on a very low resource language
|
|
|
|
In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ; https://hal.archives-ouvertes.fr/hal-01709589 ; IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2018, Calgary, Alberta, Canada (2018)
|
|
BASE
|
|
|
|
75 |
Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the “Speaking rosetta” JSALT 2017 workshop
|
|
|
|
In: ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing ; https://hal.archives-ouvertes.fr/hal-01709578 ; ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2018, Calgary, Alberta, Canada (2018)
|
|
BASE
|
|
|
|
76 |
Unsupervised Word Segmentation: does tone matter?
|
|
|
|
In: International Conference on Intelligent Text Processing and Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-01910756 ; International Conference on Intelligent Text Processing and Computational Linguistics, Mar 2018, Hanoï, Vietnam (2018)
|
|
BASE
|
|
|
|
77 |
Automatic Recognition of Affective Laughter in Spontaneous Dyadic Interactions from Audiovisual Signals
|
|
|
|
In: International Conference on Multimodal Interaction (ICMI 2018) ; https://hal.archives-ouvertes.fr/hal-01994000 ; International Conference on Multimodal Interaction (ICMI 2018), Oct 2018, Boulder, CO, United States. pp.220-228, ⟨10.1145/3242969.3243012⟩ (2018)
|
|
BASE
|
|
|
|
78 |
Token-level and sequence-level loss smoothing for RNN language models
|
|
|
|
In: ACL - 56th Annual Meeting of the Association for Computational Linguistics ; https://hal.inria.fr/hal-01790879 ; ACL - 56th Annual Meeting of the Association for Computational Linguistics, Jul 2018, Melbourne, Australia. pp.2094-2103 ; https://aclanthology.info/papers/P18-1195/p18-1195 (2018)
|
|
BASE
|
|
|
|
79 |
Adaptor Grammars for the Linguist: Word Segmentation Experiments for Very Low-Resource Languages
|
|
|
|
In: Workshop on Computational Research in Phonetics, Phonology, and Morphology ; https://hal.archives-ouvertes.fr/hal-01910757 ; Workshop on Computational Research in Phonetics, Phonology, and Morphology, Oct 2018, Bruxelles, Belgium. pp.32 - 42, ⟨10.18653/v1/P17⟩ (2018)
|
|
BASE
|
|
|
|
80 |
Parallel Corpora in Mboshi (Bantu C25, Congo-Brazzaville)
|
|
|
|
In: 11th edition of the Language Resources and Evaluation Conference (LREC 2018) ; https://hal.archives-ouvertes.fr/hal-01710043 ; 11th edition of the Language Resources and Evaluation Conference (LREC 2018), ELRA, May 2018, Miyazaki, Japan (2018)
|
|
BASE
|
|
|
|
|
|