
Search in the Catalogues and Directories

Hits 1 – 20 of 27

1
Domain-Adversarial Based Model with Phonological Knowledge for Cross-Lingual Speech Recognition
In: Electronics; Volume 10; Issue 24; Pages: 3172 (2021)
BASE
2
MISPRONUNCIATION DETECTION AND DIAGNOSIS IN MANDARIN ACCENTED ENGLISH SPEECH
In: Theses and Dissertations--Electrical and Computer Engineering (2020)
BASE
3
Articulation modelling of vowels in dysarthric and non-dysarthric speech
Albalkhi, Rahaf. - 2020
BASE
4
The Vowel System of Korebaju
In: Interspeech 2019 ; https://hal.archives-ouvertes.fr/hal-02420035 ; INTERSPEECH 2019, Sep 2019, Graz, Austria. ⟨10.21437/Interspeech.2019-3210⟩ ; https://www.interspeech2019.org/ (2019)
BASE
5
The effects of L1 AP-initial boundary tones and laryngeal features in Korean adaptation of Japanese plosives followed by a H or L vowel
In: Glossa: a journal of general linguistics; Vol 4, No 1 (2019); 49 ; 2397-1835 (2019)
BASE
6
Articulatory representations to address acoustic variability in speech ...
Sivaraman, Ganesh. - : Digital Repository at the University of Maryland, 2017
BASE
7
Articulatory representations to address acoustic variability in speech
Abstract: The past decade has seen phenomenal improvement in the performance of Automatic Speech Recognition (ASR) systems. In spite of this vast improvement in performance, the state-of-the-art still lags significantly behind human speech recognition. Even though certain systems claim super-human performance, this performance often is sub-par across domains and across datasets. This gap is predominantly due to the lack of robustness against speech variability. Even clean speech is extremely variable due to a large number of factors such as voice characteristics, speaking style, speaking rate, accents, casualness, emotions and more. The goal of this thesis is to investigate the variability of speech from the perspective of speech production, put forth robust articulatory features to address this variability, and to incorporate these features in state-of-the-art ASR systems in the best way possible. ASR systems model speech as a sequence of distinctive phone units like beads on a string. Although phonemes are distinctive units in the cognitive domain, their physical realizations are extremely varied due to coarticulation and lenition which are commonly observed in conversational speech. The traditional approaches deal with this issue by performing di-, tri- or quin-phone based acoustic modeling but are insufficient to model longer contextual dependencies. Articulatory phonology analyzes speech as a constellation of coordinated articulatory gestures performed by the articulators in the vocal tract (lips, tongue tip, tongue body, jaw, glottis and velum). In this framework, acoustic variability is explained by the temporal overlap of gestures and their reduction in space. In order to analyze speech in terms of articulatory gestures, the gestures need to be estimated from the speech signal. 
The first part of the thesis focuses on a speaker independent acoustic-to-articulatory inversion system that was developed to estimate vocal tract constriction variables (TVs) from speech. The mapping from acoustics to TVs was learned from the multi-speaker X-ray Microbeam (XRMB) articulatory dataset. Constriction regions from TV trajectories were defined as articulatory gestures using articulatory kinematics. The speech inversion system combined with the TV kinematics based gesture annotation provided a system to estimate articulatory gestures from speech.
The second part of this thesis deals with the analysis of the articulatory trajectories under different types of variability such as multiple speakers, speaking rate, and accents. It was observed that speaker variation degraded the performance of the speech inversion system. A Vocal Tract Length Normalization (VTLN) based speaker normalization technique was therefore developed to address the speaker variability in the acoustic and articulatory domains. The performance of speech inversion systems was analyzed on an articulatory dataset containing speaking rate variations to assess if the model was able to reliably predict the TVs in challenging coarticulatory scenarios. The performance of the speech inversion system was analyzed in cross accent and cross language scenarios through experiments on a Dutch and British English articulatory dataset. These experiments provide a quantitative measure of the robustness of the speech inversion systems to different speech variability.
The final part of this thesis deals with the incorporation of articulatory features in state-of-the-art medium vocabulary ASR systems. A hybrid convolutional neural network (CNN) architecture was developed to fuse the acoustic and articulatory feature streams in an ASR system. ASR experiments were performed on the Wall Street Journal (WSJ) corpus. Several articulatory feature combinations were explored to determine the best feature combination. Cross-corpus evaluations were carried out to evaluate the WSJ trained ASR system on the TIMIT and another dataset containing speaking rate variability. Results showed that combining articulatory features with acoustic features through the hybrid CNN improved the performance of the ASR system in matched and mismatched evaluation conditions.
The findings based on this dissertation indicate that articulatory representations extracted from acoustics can be used to address acoustic variability in speech observed due to speakers, accents, and speaking rates and further be used to improve the performance of Automatic Speech Recognition systems.
Keyword: Articulatory features; Articulatory phonology; Automatic Speech Recognition; Electrical engineering; Linguistics; Speaker adaptation; Speech inversion; Speech variability
URL: https://doi.org/10.13016/M2BK16R29
http://hdl.handle.net/1903/20422
BASE
8
Stress Effects on Stop Bursts in Five Languages
In: Laboratory Phonology: Journal of the Association for Laboratory Phonology; Vol 7, No 1 (2016); 16 ; 1868-6354 (2016)
BASE
9
Stress Effects on Stop Bursts in Five Languages
Tabain, M; Breen, G; Butcher, Andrew Richard. - : Ubiquity Press, 2016
BASE
10
Exceptional nasal-stop inventories
BASE
11
Exceptional nasal-stop inventories ; Inventaris excepcionals d’oclusives nasals
In: Catalan Journal of Linguistics; Vol. 15 (2016): Les excepcions en fonologia; p. 67-100 (2016)
BASE
12
Place oppositions in English coronal obstruents: an ultrasound study
BASE
13
Palate-referenced Articulatory Features for Acoustic-to-Articulator Inversion
In: Speech Pathology and Audiology Faculty Research and Publications (2014)
BASE
14
An Emergent Approach to the Guttural Natural Class
In: Proceedings of the Annual Meetings on Phonology; Proceedings of the 2013 Annual Meeting on Phonology ; 2377-3324 (2014)
BASE
15
Multiview feature learning for speech recognition
BASE
16
Speech Bandwidth Extension Using Articulatory Features
Shin, Dongeek. - 2011
BASE
17
Cross-lingual automatic speech recognition using tandem features
Lal, Partha. - : The University of Edinburgh, 2011
BASE
18
Perception of initial obstruent voicing is influenced by gestural organization
In: ISSN: 0095-4470 ; EISSN: 1095-8576 ; Journal of Phonetics ; https://halshs.archives-ouvertes.fr/halshs-00683110 ; Journal of Phonetics, Elsevier, 2010, 38, pp.109-126 (2010)
BASE
19
Statistical Mapping between Articulatory Movements and Acoustic Spectrum Using a Gaussian Mixture Model
In: http://spalab.naist.jp/~tomoki/Tomoki/Journals/SPECOM-Mar-2008_ArtSpMap.pdf (2007)
BASE
20
Multilingual Articulatory Features for Speech Recognition
In: http://rave.ohiolink.edu/etdc/view?acc_num=wright1176169264 (2007)
BASE
