
Search in the Catalogues and Directories

Hits 101 – 120 of 3,146

101
Task and language in Spanish–English narratives (Wofford et al., 2022) ...
BASE
102
Task and language in Spanish–English narratives (Wofford et al., 2022) ...
BASE
103
Detecting Dysfluencies in Stuttering Therapy Using wav2vec 2.0 ...
Abstract: Stuttering is a varied speech disorder that harms an individual's communication ability. Persons who stutter (PWS) often use speech therapy to cope with their condition. Improving speech recognition systems for people with such non-typical speech or tracking the effectiveness of speech therapy would require systems that can detect dysfluencies while at the same time being able to detect speech techniques acquired in therapy. This paper shows that fine-tuning wav2vec 2.0 for the classification of stuttering on a sizeable English corpus containing stuttered speech, in conjunction with multi-task learning, boosts the effectiveness of the general-purpose wav2vec 2.0 features for detecting stuttering in speech, both within and across languages. We evaluate our method on Fluencybank and the German therapy-centric Kassel State of Fluency (KSoF) dataset by training Support Vector Machine classifiers using features extracted from the fine-tuned models for six different stuttering-related event types: blocks, ... : Submitted to Interspeech 2022 ...
Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering
URL: https://dx.doi.org/10.48550/arxiv.2204.03417
https://arxiv.org/abs/2204.03417
BASE
104
Multi-sequence Intermediate Conditioning for CTC-based ASR ...
BASE
105
Code Switched and Code Mixed Speech Recognition for Indic languages ...
BASE
106
Simple and Effective Unsupervised Speech Synthesis ...
BASE
107
Multistream neural architectures for cued-speech recognition using a pre-trained visual feature extractor and constrained CTC decoding ...
BASE
108
Applying Feature Underspecified Lexicon Phonological Features in Multilingual Text-to-Speech ...
Zhang, Cong; Zeng, Huinan; Liu, Huang. - : arXiv, 2022
BASE
109
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling ...
Peng, Puyuan; Harwath, David. - : arXiv, 2022
BASE
110
CALM: Contrastive Aligned Audio-Language Multirate and Multimodal Representations ...
BASE
111
Enhance Language Identification using Dual-mode Model with Knowledge Distillation ...
BASE
112
MAESTRO: Matched Speech Text Representations through Modality Matching ...
BASE
113
Improving Language Identification of Accented Speech ...
Kukk, Kunnar; Alumäe, Tanel. - : arXiv, 2022
BASE
114
Cross-stitched Multi-modal Encoders ...
BASE
115
Improving Non-native Word-level Pronunciation Scoring with Phone-level Mixup Data Augmentation and Multi-source Information ...
Fu, Kaiqi; Gao, Shaojun; Wang, Kai. - : arXiv, 2022
BASE
116
Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems ...
BASE
117
ASR-Aware End-to-end Neural Diarization ...
BASE
118
Sample, Translate, Recombine: Leveraging Audio Alignments for Data Augmentation in End-to-end Speech Translation ...
BASE
119
Wavebender GAN: An architecture for phonetically meaningful speech manipulation ...
BASE
120
UK-South Korea Prosody Research Network ...
Jeon, Hae-Sung. - : Open Science Framework, 2022
BASE


© 2013 - 2024 Lin|gu|is|tik