62 |
Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition ...
Liu, Qianying; Yang, Yuhang; Gong, Zhuo; Li, Sheng; Ding, Chenchen; Minematsu, Nobuaki; Huang, Hao; Cheng, Fei; Kurohashi, Sadao. arXiv, 2022
Abstract:
Low-resource speech recognition has long suffered from insufficient training data. While neighbouring languages are often used as auxiliary training data, it is difficult for the model to induce similar units (character, subword, etc.) across languages. In this paper, we assume that similar units in neighbouring languages share similar term frequencies and form a Huffman tree to perform multilingual hierarchical Softmax decoding. During decoding, the hierarchical structure can benefit the training of low-resource languages. Experimental results show the effectiveness of our method. ... : 5 pages, Interspeech submission ...
Keywords:
Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); FOS: Computer and information sciences; FOS: Electrical engineering, electronic engineering, information engineering
URL: https://arxiv.org/abs/2204.03855 https://dx.doi.org/10.48550/arxiv.2204.03855
63 |
Cross-lingual Self-Supervised Speech Representations for Improved Dysarthric Speech Recognition ...
65 |
Code-Switching Text Augmentation for Multilingual Speech Processing ...
67 |
Self-supervised Learning with Random-projection Quantizer for Speech Recognition ...
69 |
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation ...
71 |
Improving the fusion of acoustic and text representations in RNN-T ...
72 |
Data and knowledge-driven approaches for multilingual training to improve the performance of speech recognition systems of Indian languages ...
73 |
Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques ...
75 |
Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset ...
76 |
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis ...
77 |
Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers ...
79 |
A Character-level Span-based Model for Mandarin Prosodic Structure Prediction ...
80 |
Fine-grained Noise Control for Multispeaker Speech Synthesis ...