
Hits 61 – 80 of 11.324

61	A Hierarchical Model for Spoken Language Recognition ...
	Ferrer, Luciana; Castan, Diego; McLaren, Mitchell. - : arXiv, 2022
	BASE

62	Hierarchical Softmax for End-to-End Low-resource Multilingual Speech Recognition ...
	Liu, Qianying; Yang, Yuhang; Gong, Zhuo. - : arXiv, 2022
	BASE

63	Cross-lingual Self-Supervised Speech Representations for Improved Dysarthric Speech Recognition ...
	Hernandez, Abner; Pérez-Toro, Paula Andrea; Nöth, Elmar; Orozco-Arroyave, Juan Rafael; Maier, Andreas; Yang, Seung Hee. - : arXiv, 2022
	Abstract: State-of-the-art automatic speech recognition (ASR) systems perform well on healthy speech. However, the performance on impaired speech still remains an issue. The current study explores the usefulness of using Wav2Vec self-supervised speech representations as features for training an ASR system for dysarthric speech. Dysarthric speech recognition is particularly difficult as several aspects of speech such as articulation, prosody and phonation can be impaired. Specifically, we train an acoustic model with features extracted from Wav2Vec, Hubert, and the cross-lingual XLSR model. Results suggest that speech representations pretrained on large unlabelled data can improve word error rate (WER) performance. In particular, features from the multilingual model led to lower WERs than filterbanks (Fbank) or models trained on a single language. Improvements were observed in English speakers with cerebral palsy caused dysarthria (UASpeech corpus), Spanish speakers with Parkinsonian dysarthria (PC-GITA corpus) and ... : Submitted for review at Interspeech 2022 ...
	Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
	URL: https://dx.doi.org/10.48550/arxiv.2204.01670 https://arxiv.org/abs/2204.01670
	BASE

64	Multilingual Simultaneous Speech Translation ...
	Subramanya, Shashank; Niehues, Jan. - : arXiv, 2022
	BASE

65	Code-Switching Text Augmentation for Multilingual Speech Processing ...
	Hussein, Amir; Chowdhury, Shammur Absar; Abdelali, Ahmed. - : arXiv, 2022
	BASE

66	Multilingual and Multimodal Abuse Detection ...
	Sharon, Rini; Shah, Heet; Mukherjee, Debdoot. - : arXiv, 2022
	BASE

67	Self-supervised Learning with Random-projection Quantizer for Speech Recognition ...
	Chiu, Chung-Cheng; Qin, James; Zhang, Yu. - : arXiv, 2022
	BASE

68	BEA-Base: A Benchmark for ASR of Spontaneous Hungarian ...
	Mihajlik, P.; Balog, A.; Gráczi, T. E.. - : arXiv, 2022
	BASE

69	CVSS Corpus and Massively Multilingual Speech-to-Speech Translation ...
	Jia, Ye; Ramanovich, Michelle Tadmor; Wang, Quan. - : arXiv, 2022
	BASE

70	ADIMA: Abuse Detection In Multilingual Audio ...
	Gupta, Vikram; Sharon, Rini; Sawhney, Ramit. - : arXiv, 2022
	BASE

71	Improving the fusion of acoustic and text representations in RNN-T ...
	Zhang, Chao; Li, Bo; Lu, Zhiyun. - : arXiv, 2022
	BASE

72	Data and knowledge-driven approaches for multilingual training to improve the performance of speech recognition systems of Indian languages ...
	Madhavaraj, A.; Ganesan, Ramakrishnan Angarai. - : arXiv, 2022
	BASE

73	Tackling data scarcity in speech translation using zero-shot multilingual machine translation techniques ...
	Dinh, Tu Anh; Liu, Danni; Niehues, Jan. - : arXiv, 2022
	BASE

74	Cross-view Brain Decoding ...
	Oota, Subba Reddy; Arora, Jashn; Gupta, Manish. - : arXiv, 2022
	BASE

75	Automatic Speech Recognition Datasets in Cantonese: A Survey and New Dataset ...
	Yu, Tiezheng; Frieske, Rita; Xu, Peng. - : arXiv, 2022
	BASE

76	WavThruVec: Latent speech representation as intermediate features for neural speech synthesis ...
	Siuzdak, Hubert; Dura, Piotr; van Rijn, Pol. - : arXiv, 2022
	BASE

77	Knowledge Transfer from Large-scale Pretrained Language Models to End-to-end Speech Recognizers ...
	Kubo, Yotaro; Karita, Shigeki; Bacchiani, Michiel. - : arXiv, 2022
	BASE

78	The VoicePrivacy 2022 Challenge Evaluation Plan ...
	Tomashenko, Natalia; Wang, Xin; Miao, Xiaoxiao. - : arXiv, 2022
	BASE

79	A Character-level Span-based Model for Mandarin Prosodic Structure Prediction ...
	Chen, Xueyuan; Song, Changhe; Zhou, Yixuan. - : arXiv, 2022
	BASE

80	Fine-grained Noise Control for Multispeaker Speech Synthesis ...
	Nikitaras, Karolos; Vamvoukakis, Georgios; Ellinas, Nikolaos. - : arXiv, 2022
	BASE

