
Search in the Catalogues and Directories

Hits 1 – 20 of 124

1
Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition ...
BASE
2
Multilingual and Multimodal Abuse Detection ...
BASE
3
WavThruVec: Latent speech representation as intermediate features for neural speech synthesis ...
BASE
4
Fine-grained Noise Control for Multispeaker Speech Synthesis ...
BASE
5
Emotion Intensity and its Control for Emotional Voice Conversion ...
Zhou, Kun; Sisman, Berrak; Rana, Rajib. - : arXiv, 2022
BASE
6
Low-dimensional representation of infant and adult vocalization acoustics ...
BASE
7
Chain-based Discriminative Autoencoders for Speech Recognition ...
BASE
8
Filter-based Discriminative Autoencoders for Children Speech Recognition ...
BASE
9
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling ...
Peng, Puyuan; Harwath, David. - : arXiv, 2022
BASE
10
Continual Learning for Monolingual End-to-End Automatic Speech Recognition ...
BASE
11
Cetacean Translation Initiative: a roadmap to deciphering the communication of sperm whales ...
BASE
12
Speech Representations and Phoneme Classification for Preserving the Endangered Language of Ladin ...
Durante, Zane; Mathur, Leena; Ye, Eric. - : arXiv, 2021
BASE
13
Applying Phonological Features in Multilingual Text-To-Speech ...
Zhang, Cong; Zeng, Huinan; Liu, Huang. - : arXiv, 2021
BASE
14
English Accent Accuracy Analysis in a State-of-the-Art Automatic Speech Recognition System ...
BASE
15
Cross-lingual Low Resource Speaker Adaptation Using Phonological Features ...
Abstract: The idea of using phonological features instead of phonemes as input to sequence-to-sequence TTS has been recently proposed for zero-shot multilingual speech synthesis. This approach is useful for code-switching, as it facilitates the seamless uttering of foreign text embedded in a stream of native text. In our work, we train a language-agnostic multispeaker model conditioned on a set of phonologically derived features common across different languages, with the goal of achieving cross-lingual speaker adaptation. We first experiment with the effect of language phonological similarity on cross-lingual TTS of several source-target language combinations. Subsequently, we fine-tune the model with very limited data of a new speaker's voice in either a seen or an unseen language, and achieve synthetic speech of equal quality, while preserving the target speaker's identity. With as few as 32 and 8 utterances of target speaker data, we obtain high speaker similarity scores and naturalness comparable to the ... : Proceedings of INTERSPEECH 2021 ...
Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Machine Learning cs.LG; Sound cs.SD
URL: https://dx.doi.org/10.48550/arxiv.2111.09075
https://arxiv.org/abs/2111.09075
BASE
16
Arabic Speech Recognition by End-to-End, Modular Systems and Human ...
BASE
17
The INTERSPEECH 2021 Computational Paralinguistics Challenge: COVID-19 Cough, COVID-19 Speech, Escalation & Primates ...
BASE
18
Discrete representations in neural models of spoken language ...
BASE
19
Phrase break prediction with bidirectional encoder representations in Japanese text-to-speech synthesis ...
BASE
20
Learning De-identified Representations of Prosody from Raw Audio ...
BASE


Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 124