1 |
Synthesizing Dysarthric Speech Using Multi-talker TTS for Dysarthric Speech Recognition ...
|
|
|
|
Abstract:
Dysarthria is a motor speech disorder often characterized by reduced speech intelligibility through slow, uncoordinated control of speech production muscles. Automatic speech recognition (ASR) systems may help dysarthric talkers communicate more effectively. To have robust dysarthria-specific ASR, sufficient training speech is required, which is not readily available. Recent advances in multi-speaker end-to-end text-to-speech (TTS) synthesis suggest the possibility of using synthesis for data augmentation. In this paper, we aim to improve multi-speaker end-to-end TTS systems to synthesize dysarthric speech for improved training of a dysarthria-specific DNN-HMM ASR. In the synthesized speech, we add dysarthria severity level and pause insertion mechanisms to other control parameters such as pitch, energy, and duration. Results show that a DNN-HMM model trained on additional synthetic dysarthric speech achieves a word error rate (WER) improvement of 12.2% compared to the baseline, the addition of the severity level ... : Accepted ICASSP 2022 ...
|
|
Keyword:
Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
|
|
URL: https://dx.doi.org/10.48550/arxiv.2201.11571 https://arxiv.org/abs/2201.11571
|
|
BASE
|
|
|
|
2 |
Introducing Phonetic Information to Speaker Embedding for Speaker Verification
|
|
|
|
In: Electrical and Computer Engineering Faculty Publications (2019)
|
|
BASE
|
|
|
|
4 |
Advanced Recurrent Network-Based Hybrid Acoustic Models for Low Resource Speech Recognition
|
|
|
|
In: Electrical and Computer Engineering Faculty Publications (2018)
|
|
BASE
|
|
|
|
5 |
Investigation of Frame Alignments for GMM-based Digit-prompted Speaker Verification ...
|
|
|
|
BASE
|
|
|
|
6 |
Comparison of Multiple Features and Modeling Methods for Text-dependent Speaker Verification ...
|
|
|
|
BASE
|
|
|
|
7 |
Jaw Rotation in Dysarthria Measured With a Single Electromagnetic Articulography Sensor
|
|
|
|
In: Speech Pathology and Audiology Faculty Research and Publications (2017)
|
|
BASE
|
|
|
|
8 |
Development of Kinematic Templates for Automatic Pronunciation Assessment Using Acoustic-to-Articulatory Inversion
|
|
|
|
In: Master's Theses (2009 -) (2017)
|
|
BASE
|
|
|
|
9 |
Acoustic sequences in non-human animals : a tutorial review and prospectus
|
|
|
|
BASE
|
|
|
|
10 |
Analysis of Interference between Electromagnetic Articulography and Electroglottograph Systems
|
|
|
|
In: Master's Theses (2009 -) (2016)
|
|
BASE
|
|
|
|
11 |
Parallel Reference Speaker Weighting for Kinematic-Independent Acoustic-to-Articulatory Inversion
|
|
|
|
In: Speech Pathology and Audiology Faculty Research and Publications (2016)
|
|
BASE
|
|
|
|
12 |
Acoustic Sequences in Non-human Animals: A Tutorial Review and Prospectus
|
|
|
|
In: Electrical and Computer Engineering Faculty Research and Publications (2016)
|
|
BASE
|
|
|
|
13 |
Acoustic sequences in non-human animals: a tutorial review and prospectus
|
|
|
|
BASE
|
|
|
|
14 |
Acoustic sequences in non-human animals: a tutorial review and prospectus.
|
|
|
|
BASE
|
|
|
|
16 |
Embodied cognition, Latin pedagogy, and the rhetorical foundations of medieval vernacular poetry
|
|
|
|
BASE
|
|
|
|
18 |
The Electromagnetic Articulography Mandarin Accented English (EMA-MAE) Corpus of Acoustic and 3D Articulatory Kinematic Data
|
|
|
|
In: Speech Pathology and Audiology Faculty Research and Publications (2014)
|
|
BASE
|
|
|
|
19 |
Sensorimotor Adaptation of Speech Using Real-time Articulatory Resynthesis
|
|
|
|
In: Speech Pathology and Audiology Faculty Research and Publications (2014)
|
|
BASE
|
|
|
|
20 |
Physiologically-motivated Feature Extraction for Speaker Identification
|
|
|
|
In: Electrical and Computer Engineering Faculty Research and Publications (2014)
|
|
BASE
|
|
|
|
|
|