Hits 101 – 120 of 3,146

101	Task and language in Spanish–English narratives (Wofford et al., 2022) ...
	Wofford, Mary Claire; Cano, Jessica; Goodrich, J. Marc. - : ASHA journals, 2022
	BASE

102	Task and language in Spanish–English narratives (Wofford et al., 2022) ...
	Wofford, Mary Claire; Cano, Jessica; Goodrich, J. Marc. - : ASHA journals, 2022
	BASE

103	Detecting Dysfluencies in Stuttering Therapy Using wav2vec 2.0 ...
	Bayerl, Sebastian P.; Wagner, Dominik; Nöth, Elmar; Riedhammer, Korbinian. - : arXiv, 2022
	Abstract: Stuttering is a varied speech disorder that harms an individual's communication ability. Persons who stutter (PWS) often use speech therapy to cope with their condition. Improving speech recognition systems for people with such non-typical speech or tracking the effectiveness of speech therapy would require systems that can detect dysfluencies while at the same time being able to detect speech techniques acquired in therapy. This paper shows that fine-tuning wav2vec 2.0 for the classification of stuttering on a sizeable English corpus containing stuttered speech, in conjunction with multi-task learning, boosts the effectiveness of the general-purpose wav2vec 2.0 features for detecting stuttering in speech, both within and across languages. We evaluate our method on Fluencybank and the German therapy-centric Kassel State of Fluency (KSoF) dataset by training Support Vector Machine classifiers using features extracted from the fine-tuned models for six different stuttering-related event types: blocks, ... : Submitted to Interspeech 2022 ...
	Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering
	URL: https://dx.doi.org/10.48550/arxiv.2204.03417 https://arxiv.org/abs/2204.03417
	BASE

104	Multi-sequence Intermediate Conditioning for CTC-based ASR ...
	Fujita, Yusuke; Komatsu, Tatsuya; Kida, Yusuke. - : arXiv, 2022
	BASE

105	Code Switched and Code Mixed Speech Recognition for Indic languages ...
	Chadha, Harveen Singh; Shah, Priyanshi; Dhuriya, Ankur. - : arXiv, 2022
	BASE

106	Simple and Effective Unsupervised Speech Synthesis ...
	Liu, Alexander H.; Lai, Cheng-I Jeff; Hsu, Wei-Ning. - : arXiv, 2022
	BASE

107	Multistream neural architectures for cued-speech recognition using a pre-trained visual feature extractor and constrained CTC decoding ...
	Sankar, Sanjana; Beautemps, Denis; Hueber, Thomas. - : arXiv, 2022
	BASE

108	Applying Feature Underspecified Lexicon Phonological Features in Multilingual Text-to-Speech ...
	Zhang, Cong; Zeng, Huinan; Liu, Huang. - : arXiv, 2022
	BASE

109	Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling ...
	Peng, Puyuan; Harwath, David. - : arXiv, 2022
	BASE

110	CALM: Contrastive Aligned Audio-Language Multirate and Multimodal Representations ...
	Sachidananda, Vin; Tseng, Shao-Yen; Marchi, Erik. - : arXiv, 2022
	BASE

111	Enhance Language Identification using Dual-mode Model with Knowledge Distillation ...
	Liu, Hexin; Perera, Leibny Paola Garcia; Khong, Andy W. H. - : arXiv, 2022
	BASE

112	MAESTRO: Matched Speech Text Representations through Modality Matching ...
	Chen, Zhehuai; Zhang, Yu; Rosenberg, Andrew. - : arXiv, 2022
	BASE

113	Improving Language Identification of Accented Speech ...
	Kukk, Kunnar; Alumäe, Tanel. - : arXiv, 2022
	BASE

114	Cross-stitched Multi-modal Encoders ...
	Singla, Karan; Pressel, Daniel; Price, Ryan. - : arXiv, 2022
	BASE

115	Improving Non-native Word-level Pronunciation Scoring with Phone-level Mixup Data Augmentation and Multi-source Information ...
	Fu, Kaiqi; Gao, Shaojun; Wang, Kai. - : arXiv, 2022
	BASE

116	Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems ...
	Udagawa, Takuma; Suzuki, Masayuki; Kurata, Gakuto. - : arXiv, 2022
	BASE

117	ASR-Aware End-to-end Neural Diarization ...
	Khare, Aparna; Han, Eunjung; Yang, Yuguang. - : arXiv, 2022
	BASE

118	Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation ...
	Lam, Tsz Kin; Schamoni, Shigehiko; Riezler, Stefan. - : arXiv, 2022
	BASE

119	Wavebender GAN: An architecture for phonetically meaningful speech manipulation ...
	Beck, Gustavo Teodoro Döhler; Wennberg, Ulme; Malisz, Zofia. - : arXiv, 2022
	BASE

120	UK-South Korea Prosody Research Network ...
	Jeon, Hae-Sung. - : Open Science Framework, 2022
	BASE

© 2013 - 2024 Lin|gu|is|tik