Hits 81 – 100 of 989

81	Common Phone: A Multilingual Dataset for Robust Acoustic Modelling ...
	Klumpp, Philipp; Arias-Vergara, Tomás; Pérez-Toro, Paula Andrea. - : arXiv, 2022
	BASE

82	Polyphone disambiguation and accent prediction using pre-trained language models in Japanese TTS front-end ...
	Hida, Rem; Hamada, Masaki; Kamada, Chie. - : arXiv, 2022
	BASE

83	Low-dimensional representation of infant and adult vocalization acoustics ...
	Pagliarini, Silvia; Schneider, Sara; Kello, Christopher T.. - : arXiv, 2022
	BASE

84	Dual-Decoder Transformer For end-to-end Mandarin Chinese Speech Recognition with Pinyin and Character ...
	Yang, Zhao; Xi, Wei; Wang, Rui. - : arXiv, 2022
	BASE

85	Importance of Different Temporal Modulations of Speech: A Tale of Two Perspectives ...
	Sadhu, Samik; Hermansky, Hynek. - : arXiv, 2022
	BASE

86	Leveraging Phone Mask Training for Phonetic-Reduction-Robust E2E Uyghur Speech Recognition ...
	Ma, Guodong; Hu, Pengfei; Kang, Jian. - : arXiv, 2022
	BASE

87	Similarity and Content-based Phonetic Self Attention for Speech Recognition ...
	Shim, Kyuhong; Sung, Wonyong. - : arXiv, 2022
	BASE

88	BERT-LID: Leveraging BERT to Improve Spoken Language Identification ...
	Nie, Yuting; Zhao, Junhong; Zhang, Wei-Qiang. - : arXiv, 2022
	BASE

89	Chain-based Discriminative Autoencoders for Speech Recognition ...
	Lee, Hung-Shin; Huang, Pin-Tuan; Cheng, Yao-Fei. - : arXiv, 2022
	BASE

90	Building Robust Spoken Language Understanding by Cross Attention between Phoneme Sequence and ASR Hypothesis ...
	Wang, Zexun; Le, Yuquan; Zhu, Yi. - : arXiv, 2022
	BASE

91	STRATA: Word Boundaries & Phoneme Recognition From Continuous Urdu Speech using Transfer Learning, Attention, & Data Augmentation ...
	Naeem, Saad; Beg, Omer. - : arXiv, 2022
	BASE

92	Three-Module Modeling For End-to-End Spoken Language Understanding Using Pre-trained DNN-HMM-Based Acoustic-Phonetic Model ...
	Wang, Nick J. C.; Wang, Lu; Sun, Yandan. - : arXiv, 2022
	BASE

93	Speech segmentation using multilevel hybrid filters ...
	Faundez-Zanuy, Marcos; Vallverdu-Bayes, Francesc. - : arXiv, 2022
	BASE

94	On the relevance of language in speaker recognition ...
	Satue-Villar, Antonio; Faundez-Zanuy, Marcos. - : arXiv, 2022
	BASE

95	Improving speaker de-identification with functional data analysis of f0 trajectories ...
	Tavi, Lauri; Kinnunen, Tomi; Hautamäki, Rosa González. - : arXiv, 2022
	BASE

96	Unsupervised word-level prosody tagging for controllable speech synthesis ...
	Guo, Yiwei; Du, Chenpeng; Yu, Kai. - : arXiv, 2022
	BASE

97	Filter-based Discriminative Autoencoders for Children Speech Recognition ...
	Tai, Chiang-Lin; Lee, Hung-Shin; Tsao, Yu. - : arXiv, 2022
	BASE

98	Transducer-based language embedding for spoken language identification ...
	Shen, Peng; Lu, Xugang; Kawai, Hisashi. - : arXiv, 2022
	BASE

99	Detecting Dysfluencies in Stuttering Therapy Using wav2vec 2.0 ...
	Bayerl, Sebastian P.; Wagner, Dominik; Nöth, Elmar; Riedhammer, Korbinian. - : arXiv, 2022
	Abstract: Stuttering is a varied speech disorder that harms an individual's communication ability. Persons who stutter (PWS) often use speech therapy to cope with their condition. Improving speech recognition systems for people with such non-typical speech or tracking the effectiveness of speech therapy would require systems that can detect dysfluencies while at the same time being able to detect speech techniques acquired in therapy. This paper shows that fine-tuning wav2vec 2.0 for the classification of stuttering on a sizeable English corpus containing stuttered speech, in conjunction with multi-task learning, boosts the effectiveness of the general-purpose wav2vec 2.0 features for detecting stuttering in speech, both within and across languages. We evaluate our method on Fluencybank and the German therapy-centric Kassel State of Fluency (KSoF) dataset by training Support Vector Machine classifiers using features extracted from the fine-tuned models for six different stuttering-related event types: blocks, ... : Submitted to Interspeech 2022 ...
	Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering
	URL: https://dx.doi.org/10.48550/arxiv.2204.03417 https://arxiv.org/abs/2204.03417
	BASE

100	Multi-sequence Intermediate Conditioning for CTC-based ASR ...
	Fujita, Yusuke; Komatsu, Tatsuya; Kida, Yusuke. - : arXiv, 2022
	BASE
