
Search in the Catalogues and Directories

Hits 1 – 20 of 906

1
A Bottleneck Auto-Encoder for F0 Transformations on Speech and Singing Voice
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599085 ; Information, MDPI, 2022, 13 (3), pp.102. ⟨10.3390/info13030102⟩ (2022)
BASE
2
Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599076 ; Information, MDPI, 2022, 13 (3), pp.103. ⟨10.3390/info13030103⟩ (2022)
BASE
3
Learning and controlling the source-filter representation of speech with a variational autoencoder
In: https://hal.archives-ouvertes.fr/hal-03650569 ; 2022 (2022)
BASE
4
A comparative study of several parameterizations for speaker recognition ...
Faundez-Zanuy, Marcos. - : arXiv, 2022
BASE
5
Speaker verification in mismatch training and testing conditions ...
BASE
6
Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation ...
BASE
7
A New Amharic Speech Emotion Dataset and Classification Benchmark ...
BASE
8
The Norwegian Parliamentary Speech Corpus ...
Solberg, Per Erik; Ortiz, Pablo. - : arXiv, 2022
BASE
9
Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition ...
BASE
10
LPC Augment: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects ...
BASE
11
Automatic Dialect Density Estimation for African American English ...
BASE
12
End-to-end contextual ASR based on posterior distribution adaptation for hybrid CTC/attention system ...
Zhang, Zhengyi; Zhou, Pan. - : arXiv, 2022
BASE
13
Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems ...
BASE
14
SHAS: Approaching optimal Segmentation for End-to-End Speech Translation ...
BASE
15
Automatic Detection of Speech Sound Disorder in Child Speech Using Posterior-based Speaker Representations ...
BASE
16
Towards a Perceptual Model for Estimating the Quality of Visual Speech ...
BASE
17
Learning and controlling the source-filter representation of speech with a variational autoencoder ...
Abstract: Understanding and controlling latent representations in deep generative models is a challenging yet important problem for analyzing, transforming and generating various types of data. In speech processing, inspired by the anatomical mechanisms of phonation, the source-filter model considers that speech signals are produced from a few independent and physically meaningful continuous latent factors, among which the fundamental frequency $f_0$ and the formants are of primary importance. In this work, we show that the source-filter model of speech production naturally arises in the latent space of a variational autoencoder (VAE) trained in an unsupervised manner on a dataset of natural speech signals. Using only a few seconds of labeled speech signals generated with an artificial speech synthesizer, we experimentally illustrate that $f_0$ and the formant frequencies are encoded in orthogonal subspaces of the VAE latent space and we develop a weakly-supervised method to accurately and independently control ...
Note: 17 pages, 4 figures, companion website: https://samsad35.github.io/site-sfvae/ ...
Keyword: Audio and Speech Processing eess.AS; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Machine Learning cs.LG; Sound cs.SD
URL: https://arxiv.org/abs/2204.07075
https://dx.doi.org/10.48550/arxiv.2204.07075
BASE
18
Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation ...
BASE
19
Can Social Robots Effectively Elicit Curiosity in STEM Topics from K-1 Students During Oral Assessments? ...
BASE
20
Expression-preserving face frontalization improves visually assisted speech processing ...
BASE


Hits by source type: Catalogues 0; Bibliographies 0; Linked Open Data catalogues 0; Online resources 0; Open access documents 906