Hits 1 – 20 of 989

1	A comparative study of several parameterizations for speaker recognition ...
	Faundez-Zanuy, Marcos. - : arXiv, 2022
	BASE

2	Speaker verification in mismatch training and testing conditions ...
	Faundez-Zanuy, Marcos; Slupinski, Adam. - : arXiv, 2022
	BASE

3	Speech Segmentation Optimization using Segmented Bilingual Speech Corpus for End-to-end Speech Translation ...
	Fukuda, Ryo; Sudoh, Katsuhito; Nakamura, Satoshi. - : arXiv, 2022
	Abstract: Speech segmentation, which splits long speech into short segments, is essential for speech translation (ST). Popular VAD tools like WebRTC VAD have generally relied on pause-based segmentation. Unfortunately, pauses in speech do not necessarily match sentence boundaries, and sentences can be connected by a very short pause that is difficult to detect by VAD. In this study, we propose a speech segmentation method using a binary classification model trained using a segmented bilingual speech corpus. We also propose a hybrid method that combines VAD and the above speech segmentation method. Experimental results revealed that the proposed method is more suitable for cascade and end-to-end ST systems than conventional segmentation methods. The hybrid approach further improved the translation performance. ... : Submitted to INTERSPEECH 2022 ...
	Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
	URL: https://arxiv.org/abs/2203.15479 https://dx.doi.org/10.48550/arxiv.2203.15479
	BASE

4	A New Amharic Speech Emotion Dataset and Classification Benchmark ...
	Retta, Ephrem A.; Almekhlafi, Eiad; Sutcliffe, Richard. - : arXiv, 2022
	BASE

5	Lahjoita puhetta -- a large-scale corpus of spoken Finnish with some benchmarks ...
	Moisio, Anssi; Porjazovski, Dejan; Rouhe, Aku. - : arXiv, 2022
	BASE

6	The Norwegian Parliamentary Speech Corpus ...
	Solberg, Per Erik; Ortiz, Pablo. - : arXiv, 2022
	BASE

7	Subspace-based Representation and Learning for Phonotactic Spoken Language Recognition ...
	Lee, Hung-Shin; Tsao, Yu; Jeng, Shyh-Kang. - : arXiv, 2022
	BASE

8	LPC Augment: An LPC-Based ASR Data Augmentation Algorithm for Low and Zero-Resource Children's Dialects ...
	Johnson, Alexander; Fan, Ruchao; Morris, Robin. - : arXiv, 2022
	BASE

9	Automatic Dialect Density Estimation for African American English ...
	Johnson, Alexander; Everson, Kevin; Ravi, Vijay. - : arXiv, 2022
	BASE

10	End-to-end contextual asr based on posterior distribution adaptation for hybrid ctc/attention system ...
	Zhang, Zhengyi; Zhou, Pan. - : arXiv, 2022
	BASE

11	Towards Contextual Spelling Correction for Customization of End-to-end Speech Recognition Systems ...
	Wang, Xiaoqiang; Liu, Yanqing; Li, Jinyu. - : arXiv, 2022
	BASE

12	SHAS: Approaching optimal Segmentation for End-to-End Speech Translation ...
	Tsiamas, Ioannis; Gállego, Gerard I.; Fonollosa, José A. R.. - : arXiv, 2022
	BASE

13	Automatic Detection of Speech Sound Disorder in Child Speech Using Posterior-based Speaker Representations ...
	Ng, Si-Ioi; Ng, Cymie Wing-Yee; Wang, Jiarui. - : arXiv, 2022
	BASE

14	Deep Neural Convolutive Matrix Factorization for Articulatory Representation Decomposition ...
	Lian, Jiachen; Black, Alan W; Goldstein, Louis. - : arXiv, 2022
	BASE

15	Telepractice treatment of rhotics (Peterson et al., 2022) ...
	Peterson, Laura; Savarese, Christian; Campbell, Twylah. - : ASHA journals, 2022
	BASE

16	Telepractice treatment of rhotics (Peterson et al., 2022) ...
	Peterson, Laura; Savarese, Christian; Campbell, Twylah. - : ASHA journals, 2022
	BASE

17	Towards a Perceptual Model for Estimating the Quality of Visual Speech ...
	Aldeneh, Zakaria; Fedzechkina, Masha; Seto, Skyler. - : arXiv, 2022
	BASE

18	Learning and controlling the source-filter representation of speech with a variational autoencoder ...
	Sadok, Samir; Leglaive, Simon; Girin, Laurent. - : arXiv, 2022
	BASE

19	Correcting Misproducted Speech using Spectrogram Inpainting ...
	Ben-Simon, Talia; Kreuk, Felix; Awwad, Faten. - : arXiv, 2022
	BASE

20	Decoding Neural Correlation of Language-Specific Imagined Speech using EEG Signals ...
	Lee, Keon-Woo; Lee, Dae-Hyeok; Kim, Sung-Jin. - : arXiv, 2022
	BASE

© 2013 - 2024 Lin|gu|is|tik