1 | A Bottleneck Auto-Encoder for F0 Transformations on Speech and Singing Voice
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599085 ; Information, MDPI, 2022, 13 (3), pp.102. ⟨10.3390/info13030102⟩ (2022)
2 | Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599076 ; Information, MDPI, 2022, 13 (3), pp.103. ⟨10.3390/info13030103⟩ (2022)
3 | Learning and controlling the source-filter representation of speech with a variational autoencoder
In: https://hal.archives-ouvertes.fr/hal-03650569 ; 2022 (2022)
4 | A comparative study of several parameterizations for speaker recognition ...
5 | Speaker verification in mismatch training and testing conditions ...
Abstract: This paper presents an exhaustive study of the robustness of several parameterizations, using a new database acquired specifically for a speaker recognition application. The database includes the following variations: different recording sessions (including telephone and microphone recordings), recording rooms, and languages (it was obtained from a bilingual set of speakers). The study was performed with covariance matrices in a text-independent speaker verification application. It reveals that combining several parameterizations can improve robustness in all scenarios. ... : 4 pages, published in the 6th International Conference on Spoken Language Processing (ICSLP 2000), Vol. II, pp. 322-325. ICSLP 2000, ISBN 7-80150-144-4/G.18, Beijing (China). October 16-20, 2000. arXiv admin note: substantial text overlap with arXiv:2203.00513 ...
Keyword: Audio and Speech Processing eess.AS; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
URL: https://arxiv.org/abs/2204.00311 https://dx.doi.org/10.48550/arxiv.2204.00311
6 | Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation ...
7 | A New Amharic Speech Emotion Dataset and Classification Benchmark ...
9 | Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition ...
10 | LPC Augment: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects ...
11 | Automatic Dialect Density Estimation for African American English ...
12 | End-to-end contextual ASR based on posterior distribution adaptation for hybrid CTC/attention system ...
13 | Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems ...
14 | SHAS: Approaching optimal Segmentation for End-to-End Speech Translation ...
15 | Automatic Detection of Speech Sound Disorder in Child Speech Using Posterior-based Speaker Representations ...
16 | Towards a Perceptual Model for Estimating the Quality of Visual Speech ...
17 | Learning and controlling the source-filter representation of speech with a variational autoencoder ...
18 | Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation ...
19 | Can Social Robots Effectively Elicit Curiosity in STEM Topics from K-1 Students During Oral Assessments? ...
20 | Expression-preserving face frontalization improves visually assisted speech processing ...