
Search in the Catalogues and Directories

Hits 1 – 8 of 8

1
Dealing with linguistic mismatches for automatic speech recognition
Yang, Xuesong. - 2019
BASE
2
Semi-supervised learning for acoustic and prosodic modeling in speech applications
Huang, Jui Ting. - 2012
BASE
3
Beiträge zur statistischen Modellierung und effizienten Dekodierung in der automatischen Spracherkennung ; Contributions to statistical modeling and effecient decoding in automatic speech recognition
Willett, Daniel. - 2006
Abstract: The thesis deals with different aspects of automatic speech recognition. After an introduction that describes the most important fundamental ideas, methodologies and algorithms, some new approaches are outlined and evaluated that aim to optimize the acoustic modeling component of a speech recognition system. The goal is to fine-tune the selected modeling structure to the quantity and type of the available acoustic training data. In experimental investigations on internationally known speech recognition tasks, the presented new modeling scheme outperforms conventional systems by approximately 10% in recognition performance. In addition, the approach of tree-based clustering of context-dependent model states is extended in such a way that the specification of phonetic categories can be avoided. The recognition system clustered with the help of this procedure achieves recognition performance similar to that of the best systems in the official evaluation of the Wall Street Journal large vocabulary recognition task with 5,000 words. Furthermore, discriminative training procedures for acoustic modeling are discussed and evaluated. The approach of vocabulary-based discriminative training is proposed and the extension to vocabulary- and language model-based training is outlined in detail. The experimental results demonstrate that the approach yields better parameter estimates than Maximum-Likelihood training and conventional frame-based discriminative training. Additionally, new hybrid recognition systems with discriminatively trained preprocessing are presented. The hybrid recognition system with context-dependent modeling set up in the experiments with the Resource Management database achieves one of the best error rates ever reported for comparable systems.
Next, the two most common ways of organizing the decoding procedure are presented, and the author's contributions in this area are described and evaluated. Time-synchronous Viterbi decoding with a tree-structured recognition network that uses partial tree copies and language model smearing proved to be a powerful and efficient decoding approach for a bigram language model. The proposed A-Posteriori pruning and A-Posteriori-Lookahead pruning accelerate decoding further at the cost of only a relatively small additional search error. Moreover, the principle of stack decoding is illustrated, which is of great advantage when using language models of higher context depth. The developed stack decoder "DUcoder" is introduced. In evaluations, decoding with a 95,000-word vocabulary and a trigram language model is achieved in almost real time. This, however, still comes with a substantial search error. Finally, the German large vocabulary speech recognition system "DuDeutsch" developed by the author is presented. It supports both speaker-independent and speaker-dependent recognition with a vocabulary of up to 95,000 words. For acoustic modeling, the clustering and structure optimization procedures presented in the thesis are applied; decoding is performed with the presented stack decoder. The speaker-dependent models are derived from the speaker-independent ones using adaptation techniques. The proposed discriminative adaptation approach improves error reduction by approximately 15% compared to the common Maximum-Likelihood approach.
Keyword: Acoustic Modeling; Adaptation; Automatic Speech Recognition; ddc:62; ddc:620; Discriminative Training; Effecient Decoding; Fakultät für Ingenieurwissenschaften » Elektrotechnik und Informationstechnik; Man-Machine-Communication
URL: https://duepublico2.uni-due.de/servlets/MCRZipServlet/duepublico_derivate_00005019
https://nbn-resolving.org/urn:nbn:de:hbz:464-20061017-105241-8
https://duepublico2.uni-due.de/receive/duepublico_mods_00005019
BASE
4
Automatic Recognition of Cantonese-English Code-Mixing Speech
In: http://wing.comp.nus.edu.sg/~antho/O/O09/O09-5003.pdf
BASE
5
COMBINING SPEECH RECOGNITION AND ACOUSTIC WORD EMOTION MODELS FOR ROBUST TEXT-INDEPENDENT EMOTION RECOGNITION
In: http://www.mmk.ei.tum.de/publ/pdf/08/08sch9.pdf
BASE
6
Towards a non-parametric acoustic model: An acoustic decision tree for observation probability calculation. Interspeech 2008
In: http://www.cs.cmu.edu/~ychiu/ychiu_web_files/nonparametric.pdf
BASE
7
INTEGRATION OF MULTIPLE FEATURE SETS FOR REDUCING AMBIGUITY IN ASR
In: http://www.ece.mcgill.ca/~rrose1/papers/rose_parya_icassp07.pdf
BASE
8
Towards a Non-Parametric Acoustic Model: An Acoustic Decision Tree for Observation Probability Calculation
In: http://research.microsoft.com/pubs/78716/ADT.pdf
BASE

© 2013 - 2024 Lin|gu|is|tik