1 |
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition ...
Abstract:
Unpaired data has been shown to benefit low-resource automatic speech recognition (ASR), where it can be used in the design of hybrid models with multi-task training or language-model-dependent pre-training. In this work, we leverage unpaired data to train a general sequence-to-sequence model. Unpaired speech and text are used in the form of data pairs by generating the corresponding missing parts prior to model training. Inspired by the complementarity of the speech-PseudoLabel pair and the SynthesizedAudio-text pair in both acoustic and linguistic features, we propose a complementary joint training (CJT) method that trains a model alternately on the two data pairs. Furthermore, label masking for pseudo-labels and gradient restriction for synthesized audio are proposed to further cope with deviations from real data; the resulting method is termed CJT++. Experimental results show that, compared to speech-only training, the proposed basic CJT achieves great performance improvements on clean/other test sets, and the ... : 5 pages, 3 figures ...
Keyword:
Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; I.2.7; Sound cs.SD
URL: https://dx.doi.org/10.48550/arxiv.2204.02023 https://arxiv.org/abs/2204.02023
BASE
4 |
Functional maps of direct electrical stimulation-induced speech arrest and anomia: a multicentre retrospective study.
In: Brain : a journal of neurology, vol 144, iss 8 (2021)
BASE
5 |
The representation of variable tone sandhi patterns in Shanghai Wu
In: Laboratory Phonology: Journal of the Association for Laboratory Phonology; Vol 12, No 1 (2021); 15 ; 1868-6354 (2021)
BASE
6 |
Tracing Text Provenance via Context-Aware Lexical Substitution ...
BASE
7 |
Fixel-based evidence of microstructural damage in crossing pathways improves language mapping in Post-stroke aphasia
In: Neuroimage Clin (2021)
BASE
8 |
Electrophysiological Signatures of Perceiving Alternated Tone in Mandarin Chinese: Mismatch Negativity to Underlying Tone Conflict
In: Front Psychol (2021)
BASE
9 |
The Evaluation Model of College Students' Mental Health in the Environment of Independent Entrepreneurship Using Neural Network Technology
In: J Healthc Eng (2021)
BASE
10 |
Functional maps of direct electrical stimulation-induced speech arrest and anomia: a multicentre retrospective study
In: Brain (2021)
BASE
11 |
Where is the speech production area? Evidence from direct cortical electrical stimulation mapping
In: Brain (2021)
BASE
12 |
The genetic determinants of language network dysconnectivity in drug-naïve early stage schizophrenia
In: Brain and Mind Institute Researchers' Publications (2021)
BASE
13 |
The genetic determinants of language network dysconnectivity in drug-naïve early stage schizophrenia
In: Brain and Mind Institute Researchers' Publications (2021)
BASE
14 |
The genetic determinants of language network dysconnectivity in drug-naïve early stage schizophrenia
BASE
15 |
Linguistic diversity in a time of crisis: Language challenges of the COVID-19 pandemic. - Multilingua
IDS Mannheim
16 |
A Corpus-Based Investigation of Manner/State Complement Constructions in Mandarin Chinese ...
BASE
17 |
Chinese postgraduate taught students' transitional experience in the UK: the role of social connections
BASE
18 |
The /n/-/ŋ/ Asymmetry upon /ɻ/-Suffixation in Beijing and Elsewhere -- A Phonetically Based OT Analysis
In: North East Linguistics Society (2020)
BASE
19 |
Phonetic Duration Effects on Contour Tone Distribution
In: North East Linguistics Society (2020)
BASE
20 |
Poetry Translation From the Perspective of Creative Treason: Based on the Analysis of Xu Yuanzhong’s Translation of Spring View
In: Cross-Cultural Communication; Vol 16, No 2 (2020): Cross-Cultural Communication; 64-68 ; 1923-6700 ; 1712-8358 (2020)
BASE