
Hits 1 – 8 of 8

1	Dealing with linguistic mismatches for automatic speech recognition
	Yang, Xuesong. - 2019
	Abstract: Recent breakthroughs in automatic speech recognition (ASR) have resulted in a word error rate (WER) on par with human transcribers on the English Switchboard benchmark. However, dealing with linguistic mismatches between the training and testing data remains a significant unsolved challenge. In a monolingual setting, it is well known that the performance of ASR systems degrades significantly when presented with speech from speakers whose accents, dialects, and speaking styles differ from those encountered during system training. In a multilingual setting, ASR systems trained on a source language perform even worse when tested on another target language because of mismatches in the number of phonemes, lexical ambiguity, and the power of phonotactic constraints provided by phone-level n-grams. To address these linguistic mismatches in current ASR systems, my dissertation investigates both knowledge-gnostic and knowledge-agnostic solutions. In the first part, classic theories of acoustics and articulatory phonetics that can be transferred across a dialect continuum, from local dialects to another standardized language, are revisited. Experiments demonstrate the potential of acoustic correlates in the vicinity of landmarks to help bridge mismatches across different local or global varieties in a dialect continuum. In the second part, we design an end-to-end acoustic modeling approach based on the connectionist temporal classification loss and propose to link the training of acoustics and accent together, in a manner similar to the learning process in human speech perception. This joint model not only performed well on ASR with multiple accents but also boosted the accuracy of the accent identification task in comparison to separately trained models.
	Keyword: Acoustic Landmarks; Acoustic Modeling; Acoustic Phonetics; Automatic Speech Recognition; Connectionist Temporal Classification; Deep Learning; Distinctive Features; End-to-End; Model Compression; Multi-Accents; Multi-Lingual; Multi-Task Learning; Pronunciation Error Detection
	URL: http://hdl.handle.net/2142/105187
	BASE

2	Semi-supervised learning for acoustic and prosodic modeling in speech applications
	Huang, Jui Ting. - 2012
	BASE

3	Beiträge zur statistischen Modellierung und effizienten Dekodierung in der automatischen Spracherkennung ; Contributions to statistical modeling and efficient decoding in automatic speech recognition
	Willett, Daniel. - 2006
	BASE

4	Automatic Recognition of Cantonese-English Code-Mixing Speech
	Joyce Y. C. Chan; Houwei Cao; P. C. Ching...
	In: http://wing.comp.nus.edu.sg/~antho/O/O09/O09-5003.pdf
	BASE

5	COMBINING SPEECH RECOGNITION AND ACOUSTIC WORD EMOTION MODELS FOR ROBUST TEXT-INDEPENDENT EMOTION RECOGNITION
	Björn Schuller; Bogdan Vlasenko; Dejan Arsic...
	In: http://www.mmk.ei.tum.de/publ/pdf/08/08sch9.pdf
	BASE

6	Towards a non-parametric acoustic model: An acoustic decision tree for observation probability calculation, Interspeech 2008
	Michael L. Seltzer; Alex Acero
	In: http://www.cs.cmu.edu/~ychiu/ychiu_web_files/nonparametric.pdf
	BASE

7	INTEGRATION OF MULTIPLE FEATURE SETS FOR REDUCING AMBIGUITY IN ASR
	Richard Rose
	In: http://www.ece.mcgill.ca/~rrose1/papers/rose_parya_icassp07.pdf
	BASE

8	Towards a Non-Parametric Acoustic Model: An Acoustic Decision Tree for Observation Probability Calculation
	Michael L. Seltzer; Alex Acero; Yu-hsiang Bosco Chiu
	In: http://research.microsoft.com/pubs/78716/ADT.pdf
	BASE

© 2013 - 2024 Lin|gu|is|tik