
Search in the Catalogues and Directories

Hits 1 – 20 of 27

1
Learning and controlling the source-filter representation of speech with a variational autoencoder
In: https://hal.archives-ouvertes.fr/hal-03650569 ; 2022 (2022)
BASE
2
Learning and controlling the source-filter representation of speech with a variational autoencoder ...
BASE
3
Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation ...
BASE
4
High-resolution speaker counting in reverberant rooms using CRNN with Ambisonics features
In: EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO) ; https://hal.archives-ouvertes.fr/hal-03537323 ; EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO), Jan 2021, Amsterdam, Netherlands. pp.71-75, ⟨10.23919/Eusipco47968.2020.9287637⟩ (2021)
BASE
5
Alternate Endings: Improving Prosody for Incremental Neural TTS with Predicted Future Text Input
In: Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-03372802 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic. pp.3865-3869, ⟨10.21437/Interspeech.2021-275⟩ (2021)
BASE
6
Learning robust speech representation with an articulatory-regularized variational autoencoder
In: Proceedings of Interspeech 2021 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-03373252 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021)
BASE
7
Learning robust speech representation with an articulatory-regularized variational autoencoder ...
Abstract: It is increasingly considered that human speech perception and production both rely on articulatory representations. In this paper, we investigate whether this type of representation could improve the performance of a deep generative model (here a variational autoencoder) trained to encode and decode acoustic speech features. First we develop an articulatory model able to associate articulatory parameters describing the jaw, tongue, lips and velum configurations with vocal tract shapes and spectral features. Then we incorporate these articulatory parameters into a variational autoencoder applied to spectral features by using a regularization technique that constrains part of the latent space to follow articulatory trajectories. We show that this articulatory constraint improves model training by decreasing time to convergence and reconstruction loss at convergence, and yields better performance in a speech denoising task. ...
Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
URL: https://dx.doi.org/10.48550/arxiv.2104.03204
https://arxiv.org/abs/2104.03204
BASE
8
Towards an articulatory-driven neural vocoder for speech synthesis
In: ISSP 2020 - 12th International Seminar on Speech Production ; https://hal.archives-ouvertes.fr/hal-03184762 ; ISSP 2020 - 12th International Seminar on Speech Production, Dec 2020, Providence (virtual), United States (2020)
BASE
9
Evaluating the Potential Gain of Auditory and Audiovisual Speech-Predictive Coding Using Deep Learning
In: ISSN: 0899-7667 ; EISSN: 1530-888X ; Neural Computation ; https://hal.archives-ouvertes.fr/hal-03016083 ; Neural Computation, Massachusetts Institute of Technology Press (MIT Press), 2020, 32 (3), pp.596-625. ⟨10.1162/neco_a_01264⟩ (2020)
BASE
10
DeepPredSpeech: Computational Models of Predictive Speech Coding Based on Deep Learning ...
BASE
11
DeepPredSpeech: computational models of predictive speech coding based on deep learning ...
BASE
12
DeepPredSpeech: computational models of predictive speech coding based on deep learning ...
BASE
13
Extending the Cascaded Gaussian Mixture Regression Framework for Cross-Speaker Acoustic-Articulatory Mapping
In: ISSN: 2329-9290 ; EISSN: 2329-9304 ; IEEE/ACM Transactions on Audio, Speech and Language Processing ; https://hal.archives-ouvertes.fr/hal-01485540 ; IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2017, 25 (3), pp.662-673. ⟨10.1109/TASLP.2017.2651398⟩ (2017)
BASE
14
Automatic animation of an articulatory tongue model from ultrasound images of the vocal tract
In: ISSN: 0167-6393 ; EISSN: 1872-7182 ; Speech Communication ; https://hal.archives-ouvertes.fr/hal-01578315 ; Speech Communication, Elsevier : North-Holland, 2017, 93, pp.63 - 75. ⟨10.1016/j.specom.2017.08.002⟩ (2017)
BASE
15
Voice Activity Detection Based on Statistical Likelihood Ratio With Adaptive Thresholding
In: IWAENC 2016 - International Workshop on Acoustic Signal Enhancement (IWAENC) ; https://hal.inria.fr/hal-01349776 ; IWAENC 2016 - International Workshop on Acoustic Signal Enhancement (IWAENC), Sep 2016, Xi'an, China. pp.1-5, ⟨10.1109/IWAENC.2016.7602911⟩ (2016)
BASE
16
Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces
In: ISSN: 1553-734X ; EISSN: 1553-7358 ; PLoS Computational Biology ; https://hal.archives-ouvertes.fr/hal-01459706 ; PLoS Computational Biology, Public Library of Science, 2016, 12 (11), pp.e1005119. ⟨10.1371/journal.pcbi.1005119⟩ (2016)
BASE
17
By2014 Articulatory-Acoustic Dataset ...
BASE
18
Real-Time Control of an Articulatory-Based Speech Synthesizer for Brain Computer Interfaces
Bocquelet, Florent; Hueber, Thomas; Girin, Laurent. - : Public Library of Science, 2016
BASE
19
Log-Rayleigh distribution: a simple and efficient statistical representation of log-spectral coefficients
In: Institute of Electrical and Electronics Engineers. IEEE transactions on audio, speech and language processing. - New York, NY : Inst. 15 (2007) 3, 796-802
BLLDB
20
Perceptual long-term variable-rate sinusoidal modeling of speech
In: Institute of Electrical and Electronics Engineers. IEEE transactions on audio, speech and language processing. - New York, NY : Inst. 15 (2007) 3, 851-861
BLLDB

