1 |
"This is Houston. Say again, please". The Behavox system for the Apollo-11 Fearless Steps Challenge (phase II) ...
BASE
2 |
Conversational telephone speech recognition for Lithuanian
In: ISSN: 0885-2308 ; EISSN: 1095-8363 ; Computer Speech and Language ; https://hal.archives-ouvertes.fr/hal-01837147 ; Computer Speech and Language, Elsevier, 2018, 49, pp.71-82 (2018)
3 |
An investigation into language model data augmentation for low-resourced STT and KWS
In: IEEE International Conference on Acoustics, Speech, and Signal Processing ; https://hal.archives-ouvertes.fr/hal-01837171 ; IEEE International Conference on Acoustics, Speech, and Signal Processing, IEEE, Mar 2017, New Orleans, United States (2017)
4 |
Language Model Data Augmentation for Keyword Spotting
In: Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-01837186 ; Annual Conference of the International Speech Communication Association, Jan 2016, San Francisco, United States (2016)
5 |
Machine Translation Based Data Augmentation for Cantonese Keyword Spotting (Author's Manuscript)
6 |
Explicit trajectories and speaker class modeling for child and adult speech recognition ; Modélisation de trajectoires et de classes de locuteurs pour la reconnaissance de voix d'enfants et d'adultes
In: XXXème édition des Journées d'Etudes sur la Parole ; https://hal.inria.fr/hal-01080343 ; XXXème édition des Journées d'Etudes sur la Parole, Jun 2014, Le Mans, France (2014)
7 |
Component Structuring and Trajectory Modeling for Speech Recognition
In: Interspeech ; https://hal.inria.fr/hal-01063653 ; Interspeech, Sep 2014, Singapore, Singapore (2014)
Abstract:
When the speech data are produced by speakers of different age and gender, the acoustic variability of any given phonetic unit becomes large, which degrades speech recognition performance. A way to go beyond the conventional Hidden Markov Model is to explicitly include speaker class information in the modeling. Speaker classes can be obtained by unsupervised clustering of the speech utterances. This paper introduces a structuring of the Gaussian components of the GMM densities with respect to speaker classes. In a first approach, the structuring of the Gaussian components is combined with speaker class-dependent mixture weights. In a second approach, the structuring is used with mixture transition matrices, which add dependencies between Gaussian components of mixture densities (as in stranded GMMs). The different approaches are evaluated and compared in detail on the TIDIGITS task. Significant improvements are obtained using the proposed approaches based on structured components. Additional results are reported for phonetic decoding on the NEOLOGOS database, a large corpus of French telephone data.
Keyword:
[INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing
URL: https://hal.inria.fr/hal-01063653/document https://hal.inria.fr/hal-01063653/file/inter2014_agorin_v11.pdf https://hal.inria.fr/hal-01063653
8 |
Efficient constrained parametrization of GMM with class-based mixture weights for Automatic Speech Recognition
In: LTC'13 - 6th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics ; https://hal.inria.fr/hal-00923202 ; LTC'13 - 6th Language & Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, Dec 2013, Poznań, Poland (2013)