Hits 1 – 20 of 783

1	A comparative study of several parameterizations for speaker recognition ...
	Faundez-Zanuy, Marcos. - : arXiv, 2022
	BASE

2	Speaker verification in mismatch training and testing conditions ...
	Faundez-Zanuy, Marcos; Slupinski, Adam. - : arXiv, 2022
	BASE

3	Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation ...
	Fukuda, Ryo; Sudoh, Katsuhito; Nakamura, Satoshi. - : arXiv, 2022
	BASE

4	A New Amharic Speech Emotion Dataset and Classification Benchmark ...
	Retta, Ephrem A.; Almekhlafi, Eiad; Sutcliffe, Richard. - : arXiv, 2022
	BASE

5	Lahjoita puhetta -- a large-scale corpus of spoken Finnish with some benchmarks ...
	Moisio, Anssi; Porjazovski, Dejan; Rouhe, Aku. - : arXiv, 2022
	BASE

6	The Norwegian Parliamentary Speech Corpus ...
	Solberg, Per Erik; Ortiz, Pablo. - : arXiv, 2022
	BASE

7	Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition ...
	Lee, Hung-Shin; Tsao, Yu; Jeng, Shyh-Kang. - : arXiv, 2022
	BASE

8	LPC Augment: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects ...
	Johnson, Alexander; Fan, Ruchao; Morris, Robin. - : arXiv, 2022
	BASE

9	Automatic Dialect Density Estimation for African American English ...
	Johnson, Alexander; Everson, Kevin; Ravi, Vijay. - : arXiv, 2022
	BASE

10	End-to-end contextual asr based on posterior distribution adaptation for hybrid ctc/attention system ...
	Zhang, Zhengyi; Zhou, Pan. - : arXiv, 2022
	BASE

11	Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems ...
	Wang, Xiaoqiang; Liu, Yanqing; Li, Jinyu. - : arXiv, 2022
	BASE

12	SHAS: Approaching optimal Segmentation for End-to-End Speech Translation ...
	Tsiamas, Ioannis; Gállego, Gerard I.; Fonollosa, José A. R.. - : arXiv, 2022
	BASE

13	Automatic Detection of Speech Sound Disorder in Child Speech Using Posterior-based Speaker Representations ...
	Ng, Si-Ioi; Ng, Cymie Wing-Yee; Wang, Jiarui. - : arXiv, 2022
	BASE

14	Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition ...
	Lian, Jiachen; Black, Alan W; Goldstein, Louis. - : arXiv, 2022
	BASE

15	Towards a Perceptual Model for Estimating the Quality of Visual Speech ...
	Aldeneh, Zakaria; Fedzechkina, Masha; Seto, Skyler. - : arXiv, 2022
	BASE

16	Learning and controlling the source-filter representation of speech with a variational autoencoder ...
	Sadok, Samir; Leglaive, Simon; Girin, Laurent; Alameda-Pineda, Xavier; Séguier, Renaud. - : arXiv, 2022
	Abstract: Understanding and controlling latent representations in deep generative models is a challenging yet important problem for analyzing, transforming and generating various types of data. In speech processing, inspired by the anatomical mechanisms of phonation, the source-filter model considers that speech signals are produced from a few independent and physically meaningful continuous latent factors, among which the fundamental frequency $f_0$ and the formants are of primary importance. In this work, we show that the source-filter model of speech production naturally arises in the latent space of a variational autoencoder (VAE) trained in an unsupervised manner on a dataset of natural speech signals. Using only a few seconds of labeled speech signals generated with an artificial speech synthesizer, we experimentally illustrate that $f_0$ and the formant frequencies are encoded in orthogonal subspaces of the VAE latent space and we develop a weakly-supervised method to accurately and independently control ... : 17 pages, 4 figures, companion website: https://samsad35.github.io/site-sfvae/ ...
	Keyword: Audio and Speech Processing eess.AS; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Machine Learning cs.LG; Sound cs.SD
	URL: https://arxiv.org/abs/2204.07075 https://dx.doi.org/10.48550/arxiv.2204.07075
	BASE

17	Correcting Misproducted Speech using Spectrogram Inpainting ...
	Ben-Simon, Talia; Kreuk, Felix; Awwad, Faten. - : arXiv, 2022
	BASE

18	Repeat after me: Self-supervised learning of acoustic-to-articulatory mapping by vocal imitation ...
	Georges, Marc-Antoine; Diard, Julien; Girin, Laurent. - : arXiv, 2022
	BASE

19	Can Social Robots Effectively Elicit Curiosity in STEM Topics from K-1 Students During Oral Assessments? ...
	Johnson, Alexander; Martin, Alejandra; Quintero, Marlen. - : arXiv, 2022
	BASE

20	An error correction scheme for improved air-tissue boundary in real-time MRI video for speech production ...
	Roy, Anwesha; Belagali, Varun; Ghosh, Prasanta Kumar. - : arXiv, 2022
	BASE
