Search in the Catalogues and Directories

Hits 41 – 60 of 989

41
Unsupervised Data Selection via Discrete Speech Representation for ASR ...
Lu, Zhiyun; Wang, Yongqiang; Zhang, Yu. - : arXiv, 2022
BASE
42
Analysis of Voice Conversion and Code-Switching Synthesis Using VQ-VAE ...
BASE
43
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation ...
BASE
44
ADIMA: Abuse Detection In Multilingual Audio ...
BASE
45
Improving the fusion of acoustic and text representations in RNN-T ...
Abstract: The recurrent neural network transducer (RNN-T) has recently become the mainstream end-to-end approach for streaming automatic speech recognition (ASR). To estimate the output distributions over subword units, RNN-T uses a fully connected layer as the joint network to fuse the acoustic representations extracted using the acoustic encoder with the text representations obtained using the prediction network based on the previous subword units. In this paper, we propose to use gating, bilinear pooling, and a combination of them in the joint network to produce more expressive representations to feed into the output layer. A regularisation method is also proposed to enable better acoustic encoder training by reducing the gradients back-propagated into the prediction network at the beginning of RNN-T training. Experimental results on a multilingual ASR setting for voice search over nine languages show that the joint use of the proposed methods can result in 4%–5% relative word error rate reductions with only a few ... : Paper to appear at ICASSP 2022 ...
Keyword: Artificial Intelligence cs.AI; Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
URL: https://dx.doi.org/10.48550/arxiv.2201.10240
https://arxiv.org/abs/2201.10240
BASE
46
Data and knowledge-driven approaches for multilingual training to improve the performance of speech recognition systems of Indian languages ...
BASE
47
Frequency-Directional Attention Model for Multilingual Automatic Speech Recognition ...
BASE
48
Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques ...
Dinh, Tu Anh; Liu, Danni; Niehues, Jan. - : arXiv, 2022
BASE
49
AVQVC: One-shot Voice Conversion by Vector Quantization with applying contrastive learning ...
BASE
50
Multimodal Clustering with Role Induced Constraints for Speaker Diarization ...
BASE
51
Cross-view Brain Decoding ...
BASE
52
Freeform Body Motion Generation from Speech ...
Xu, Jing; Zhang, Wei; Bai, Yalong. - : arXiv, 2022
BASE
53
Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition ...
Shao, Qijie; Yan, Jinghao; Kang, Jian. - : arXiv, 2022
BASE
54
Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset ...
Yu, Tiezheng; Frieske, Rita; Xu, Peng. - : arXiv, 2022
BASE
55
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis ...
BASE
56
Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers ...
BASE
57
The VoicePrivacy 2022 Challenge Evaluation Plan ...
BASE
58
A Character-level Span-based Model for Mandarin Prosodic Structure Prediction ...
BASE
59
CTA-RNN: Channel and Temporal-wise Attention RNN Leveraging Pre-trained ASR Embeddings for Speech Emotion Recognition ...
Chen, Chengxin; Zhang, Pengyuan. - : arXiv, 2022
BASE
60
Automatic Depression Detection: An Emotional Audio-Textual Corpus and a GRU/BiLSTM-based Model ...
Shen, Ying; Yang, Huiyu; Lin, Lin. - : arXiv, 2022
BASE
