
Search in the Catalogues and Directories

Hits 61–80 of 1,811

61
CTA-RNN: Channel and Temporal-wise Attention RNN Leveraging Pre-trained ASR Embeddings for Speech Emotion Recognition ...
Chen, Chengxin; Zhang, Pengyuan. - : arXiv, 2022
BASE
62
Automatic Depression Detection: An Emotional Audio-Textual Corpus and a GRU/BiLSTM-based Model ...
Shen, Ying; Yang, Huiyu; Lin, Lin. - : arXiv, 2022
BASE
63
Fine-grained Noise Control for Multispeaker Speech Synthesis ...
BASE
64
Emotion Intensity and its Control for Emotional Voice Conversion ...
Zhou, Kun; Sisman, Berrak; Rana, Rajib. - : arXiv, 2022
BASE
65
Automatic Speech recognition for Speech Assessment of Preschool Children ...
BASE
66
The HCCL-DKU system for fake audio generation task of the 2022 ICASSP ADD Challenge ...
Abstract: The voice conversion task is to modify the speaker identity of continuous speech while preserving the linguistic content. Generally, the naturalness and similarity are two main metrics for evaluating the conversion quality, which has been improved significantly in recent years. This paper presents the HCCL-DKU entry for the fake audio generation task of the 2022 ICASSP ADD challenge. We propose a novel ppg-based voice conversion model that adopts a fully end-to-end structure. Experimental results show that the proposed method outperforms other conversion models, including Tacotron-based and Fastspeech-based models, on conversion quality and spoofing performance against anti-spoofing systems. In addition, we investigate several post-processing methods for better spoofing power. Finally, we achieve second place with a deception success rate of 0.916 in the ADD challenge. ...
Keyword: Audio and Speech Processing eess.AS; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
URL: https://dx.doi.org/10.48550/arxiv.2201.12567
https://arxiv.org/abs/2201.12567
BASE
67
Dawn of the transformer era in speech emotion recognition: closing the valence gap ...
BASE
68
Deep Speech Based End-to-End Automated Speech Recognition (ASR) for Indian-English Accents ...
Dubey, Priyank; Shah, Bilal. - : arXiv, 2022
BASE
69
KazakhTTS2: Extending the Open-Source Kazakh TTS Corpus With More Data, Speakers, and Topics ...
BASE
70
Automated speech tools for helping communities process restricted-access corpora for language revival efforts ...
BASE
71
Classifying Autism from Crowdsourced Semi-Structured Speech Recordings: A Machine Learning Approach ...
BASE
72
Learning English with Peppa Pig ...
BASE
73
Language-Independent Speaker Anonymization Approach using Self-Supervised Pre-Trained Models ...
BASE
74
Separate What You Describe: Language-Queried Audio Source Separation ...
Liu, Xubo; Liu, Haohe; Kong, Qiuqiang. - : arXiv, 2022
BASE
75
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition ...
Du, Ye-Qian; Zhang, Jie; Zhu, Qiu-Shi. - : arXiv, 2022
BASE
76
DRSpeech: Degradation-Robust Text-to-Speech Synthesis with Frame-Level and Utterance-Level Acoustic Representation Learning ...
BASE
77
Arabic Text-To-Speech (TTS) Data Preparation ...
BASE
78
Hedy Lamarr and Frequency Hopping ...
Đurić D., Miloš. - : Zenodo, 2022
BASE
79
Impact of Naturalistic Field Acoustic Environments on Forensic Text-independent Speaker Verification System ...
Wang, Zhenyu; Hansen, John H. L.. - : arXiv, 2022
BASE
80
Inferring Pitch from Coarse Spectral Features ...
BASE

