Search in the Catalogues and Directories

Hits 41 – 60 of 989

41
Unsupervised Data Selection via Discrete Speech Representation for ASR ...
Lu, Zhiyun; Wang, Yongqiang; Zhang, Yu. - : arXiv, 2022
BASE
42
Analysis of Voice Conversion and Code-Switching Synthesis Using VQ-VAE ...
BASE
43
CVSS Corpus and Massively Multilingual Speech-to-Speech Translation ...
BASE
44
ADIMA: Abuse Detection In Multilingual Audio ...
BASE
45
Improving the fusion of acoustic and text representations in RNN-T ...
Zhang, Chao; Li, Bo; Lu, Zhiyun. - : arXiv, 2022
BASE
46
Data and knowledge-driven approaches for multilingual training to improve the performance of speech recognition systems of Indian languages ...
BASE
47
Frequency-Directional Attention Model for Multilingual Automatic Speech Recognition ...
BASE
48
Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques ...
Dinh, Tu Anh; Liu, Danni; Niehues, Jan. - : arXiv, 2022
BASE
49
AVQVC: One-shot Voice Conversion by Vector Quantization with applying contrastive learning ...
BASE
50
Multimodal Clustering with Role Induced Constraints for Speaker Diarization ...
BASE
51
Cross-view Brain Decoding ...
BASE
52
Freeform Body Motion Generation from Speech ...
Xu, Jing; Zhang, Wei; Bai, Yalong. - : arXiv, 2022
BASE
53
Linguistic-Acoustic Similarity Based Accent Shift for Accent Recognition ...
Shao, Qijie; Yan, Jinghao; Kang, Jian. - : arXiv, 2022
BASE
54
Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset ...
Yu, Tiezheng; Frieske, Rita; Xu, Peng. - : arXiv, 2022
BASE
55
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis ...
BASE
56
Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers ...
BASE
57
The VoicePrivacy 2022 Challenge Evaluation Plan ...
BASE
58
A Character-level Span-based Model for Mandarin Prosodic Structure Prediction ...
Abstract: The accuracy of prosodic structure prediction is crucial to the naturalness of synthesized speech in Mandarin text-to-speech systems, but it is currently limited by the widely used sequence-to-sequence framework and by error accumulation from previous word segmentation results. In this paper, we propose a span-based Mandarin prosodic structure prediction model to obtain an optimal prosodic structure tree, which can be converted to the corresponding prosodic label sequence. Instead of requiring word segmentation as a prerequisite, rich linguistic features are provided by Chinese character-level BERT and sent to an encoder with a self-attention architecture. On top of this, span representation and label scoring are used to describe all possible prosodic structure trees, each of which has a corresponding score. To find the optimal tree with the highest score for a given sentence, a bottom-up CKY-style algorithm is further used. The proposed method can predict prosodic labels of different levels at the same time and accomplish the ... : Accepted by ICASSP 2022 ...
Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
URL: https://arxiv.org/abs/2203.16922
https://dx.doi.org/10.48550/arxiv.2203.16922
BASE
59
CTA-RNN: Channel and Temporal-wise Attention RNN Leveraging Pre-trained ASR Embeddings for Speech Emotion Recognition ...
Chen, Chengxin; Zhang, Pengyuan. - : arXiv, 2022
BASE
60
Automatic Depression Detection: An Emotional Audio-Textual Corpus and a GRU/BiLSTM-based Model ...
Shen, Ying; Yang, Huiyu; Lin, Lin. - : arXiv, 2022
BASE
