
Search in the Catalogues and Directories

Page: 1 2 3 4
Hits 1 – 20 of 62

1
Applying phonetics : speech science in everyday life
Munro, Murray J. - Chichester, West Sussex : Wiley Blackwell, 2021
BLLDB
UB Frankfurt Linguistik
2
Automatic Speech Recognition systems errors for accident-prone sleepiness detection through voice
In: EUSIPCO 2021, Aug 2021, Dublin (online), Ireland ; https://hal.archives-ouvertes.fr/hal-03324033 ; ⟨10.23919/EUSIPCO54536.2021.9616299⟩ (2021)
BASE
3
Automatic Speech Recognition systems errors for objective sleepiness detection through voice
In: Proceedings Interspeech 2021, Aug 2021, Brno (virtual), Czech Republic, pp. 2476-2480 ; https://hal.archives-ouvertes.fr/hal-03328827 ; ⟨10.21437/Interspeech.2021-291⟩ (2021)
BASE
4
LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech
In: INTERSPEECH 2021: Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic ; https://hal.archives-ouvertes.fr/hal-03317730 (2021)
BASE
7
Re-synchronization using the Hand Preceding Model for Multi-modal Fusion in Automatic Continuous Cued Speech Recognition
In: IEEE Transactions on Multimedia (ISSN 1520-9210), Institute of Electrical and Electronics Engineers, 2021, 23, pp. 292-305 ; https://hal.archives-ouvertes.fr/hal-02433830 ; ⟨10.1109/TMM.2020.2976493⟩ (2021)
BASE
8
Identifying Speaker State from Multimodal Cues
Yang, Zixiaofan. - 2021
BASE
9
Unsupervised Morphological Segmentation and Part-of-Speech Tagging for Low-Resource Scenarios
Eskander, Ramy. - 2021
BASE
10
Recognizing lexical units in low-resource language contexts with supervised and unsupervised neural networks
In: https://hal.archives-ouvertes.fr/hal-03429051 ; [Research Report] LACITO (UMR 7107). 2021 (2021)
BASE
11
Speech Normalization and Data Augmentation Techniques Based on Acoustical and Physiological Constraints and Their Applications to Child Speech Recognition
Yeung, Gary Joseph. - : eScholarship, University of California, 2021
BASE
12
Large vocabulary automatic speech recognition: from hybrid to end-to-end approaches
Heba, Abdelwahab. - : HAL CCSD, 2021
In: https://hal.archives-ouvertes.fr/tel-03269807 ; Sound [cs.SD]. Université Toulouse 3 Paul Sabatier, 2021. In French (2021)
BASE
13
Supplementary material to the paper The VoicePrivacy 2020 Challenge: Results and findings
In: https://hal.archives-ouvertes.fr/hal-03335126 ; 2021 (2021)
BASE
15
The VoicePrivacy 2020 Challenge: Results and findings
In: https://hal.archives-ouvertes.fr/hal-03332224 ; 2021 (2021)
BASE
18
Enhancing Speech Privacy with Slicing
In: https://hal.inria.fr/hal-03369137 ; 2021 (2021)
BASE
20
Training RNN Language Models on Uncertain ASR Hypotheses in Limited Data Scenarios
In: https://hal.inria.fr/hal-03327306 ; 2021 (2021)
Abstract: Training domain-specific automatic speech recognition (ASR) systems requires a suitable amount of data from the target domain. In several scenarios, such as early development stages, privacy-critical applications, or under-resourced languages, only a limited amount of in-domain speech data and an even smaller amount of manual text transcriptions, if any, are available. This motivates the study of ASR language models (LMs) learned from a limited amount of in-domain speech data. Early works have attempted training of n-gram LMs from ASR N-best lists and lattices, but training and adaptation of recurrent neural network (RNN) LMs from ASR transcripts has not received attention. In this work, we study training and adaptation of RNN LMs using alternate and uncertain ASR hypotheses embedded in ASR confusion networks obtained from target domain speech data. We explore different methods for training the RNN LMs to deal with the uncertain input sequences. The first method extends the cross-entropy objective into a Kullback–Leibler (KL) divergence based training loss, the second method formulates a training loss based on a hidden Markov model (HMM), and the third method performs training on paths sampled from the confusion networks. These methods are applied to limited data setups including telephone and meeting conversation datasets. Performance is evaluated under two settings wherein no manual transcriptions or a small amount of manual transcriptions are available to aid the training. Moreover, a model adaptation setting is also evaluated wherein the RNN LM is pre-trained on an out-of-domain conversational corpus. Overall the sampling method for training RNN LMs on ASR confusion networks performs the best, and results in up to 12% relative reduction in perplexity on the meeting dataset as compared to training on ASR 1-best hypotheses, without any manual transcriptions.
Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]; automatic speech recognition; confusion networks; language models; recurrent neural networks
URL: https://hal.inria.fr/hal-03327306/document
https://hal.inria.fr/hal-03327306/file/cn2lm_manuscript.pdf
https://hal.inria.fr/hal-03327306
BASE
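The abstract above describes training RNN LMs on paths sampled from ASR confusion networks, the method that performed best in the paper. A minimal sketch of the sampling step, assuming a confusion network represented as a list of slots, where each slot is a list of (word, posterior) pairs; the function name `sample_path` and the `<eps>` deletion token are illustrative conventions, not taken from the paper:

```python
import random

def sample_path(confusion_network, rng):
    """Draw one word sequence from a confusion network by sampling
    each slot's arcs according to their posterior probabilities.
    Epsilon arcs (modelling deletions) are dropped from the output."""
    path = []
    for arcs in confusion_network:
        words, probs = zip(*arcs)
        word = rng.choices(words, weights=probs, k=1)[0]
        if word != "<eps>":
            path.append(word)
    return path

# A toy confusion network: each slot lists competing ASR hypotheses
# with their posterior probabilities summing to 1.
cn = [
    [("the", 0.9), ("a", 0.1)],
    [("meeting", 0.6), ("meetings", 0.3), ("<eps>", 0.1)],
    [("starts", 1.0)],
]

rng = random.Random(0)
samples = [sample_path(cn, rng) for _ in range(5)]
```

Sequences drawn this way can then be fed to an ordinary cross-entropy LM training loop, which is what makes the sampling approach a drop-in alternative to training on 1-best hypotheses only.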


Hit counts by source: Catalogues: 1 · Bibliographies: 1 · Linked Open Data catalogues: 0 · Online resources: 0 · Open access documents: 61
© 2013 - 2024 Lin|gu|is|tik