Dealing with distant relationships in natural language modelling for automatic speech recognition
In: 4th World Multiconference on Systemics, Cybernetics & Informatics (SCI'2000), International Institute of Informatics & Systemics, Orlando, USA, 2000, pp. 400-405
Abstract: Peer-reviewed international conference with proceedings; international audience. Classical statistical language models, called n-gram models, describe natural language using the probabilistic relationship between a word to predict and the n-1 contiguous words preceding it. The linguistic relationships present in a sentence are, of course, more complex. A first observation is that distant relationships exist. We present recent work on an alternative to n-gram models, based on splitting the history and interpolating between distant bigram models. More precisely, our model is a cheaper alternative to high-order n-grams. In conventional n-grams, when n is greater than 3, events are less frequent and the statistics are not reliable. To deal with this problem and to estimate parameters accurately, we combine a smoothed bigram with a distant 3-bigram, a distant 4-bigram and a cache of 100 words. We present new progress obtained by using a simulated annealing algorithm to compute the best parameters of this linear combination. With a 20K-word vocabulary and 40 million words of training data, our algorithm improved the perplexity by 5.4% in comparison with the Baum-Welch algorithm. Moreover, this new model outperforms a smoothed bigram by 6.1% in terms of perplexity.
Keyword: [INFO.INFO-OH] Computer Science [cs]/Other [cs.OH]; automatic speech recognition; distant bigram; simulated annealing; stochastic language modelling
URL: https://hal.inria.fr/inria-00099031
https://hal.inria.fr/inria-00099031/file/A00-R-144.pdf
https://hal.inria.fr/inria-00099031/document
BASE
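
The abstract describes a linear combination of several component models whose weights are tuned by simulated annealing to minimise perplexity. The following is a minimal sketch of that idea, not the authors' implementation. It assumes each component model (for example a smoothed bigram, the distant bigrams and the cache) has already assigned a probability to every word of a held-out text. The function names, the cooling schedule, the step size and the toy probabilities are all hypothetical choices made for illustration.

```python
import math
import random


def perplexity(weights, component_probs):
    """Perplexity of the interpolated model on held-out data.

    component_probs holds one tuple per observed word; each tuple lists the
    probability every component model assigned to that word.
    """
    log_sum = 0.0
    for probs in component_probs:
        p = sum(w * q for w, q in zip(weights, probs))
        log_sum += math.log(max(p, 1e-300))  # floor guards against log(0)
    return math.exp(-log_sum / len(component_probs))


def perturb(weights, step, rng):
    """Move a small amount of mass from one weight to another.

    The result stays non-negative and still sums to 1.
    """
    i, j = rng.sample(range(len(weights)), 2)
    delta = min(rng.uniform(0.0, step), weights[i])
    new = list(weights)
    new[i] -= delta
    new[j] += delta
    return new


def anneal_weights(component_probs, n_steps=5000, t_start=1.0, t_end=1e-3,
                   step=0.1, seed=0):
    """Search for interpolation weights with simulated annealing.

    Assumed (hypothetical) settings: uniform starting weights and a
    geometric cooling schedule from t_start down to t_end.
    """
    rng = random.Random(seed)
    k = len(component_probs[0])
    current = [1.0 / k] * k
    current_cost = perplexity(current, component_probs)
    best, best_cost = current, current_cost
    for n in range(n_steps):
        temperature = t_start * (t_end / t_start) ** (n / n_steps)
        cand = perturb(current, step, rng)
        cand_cost = perplexity(cand, component_probs)
        # Always accept improvements; accept a worse candidate with a
        # probability that shrinks as the temperature drops.
        if (cand_cost <= current_cost
                or rng.random() < math.exp((current_cost - cand_cost) / temperature)):
            current, current_cost = cand, cand_cost
            if current_cost < best_cost:
                best, best_cost = current, current_cost
    return best, best_cost


# Toy held-out data (made up): four components, three observed words.
toy = [
    (0.20, 0.05, 0.02, 0.01),
    (0.01, 0.30, 0.05, 0.02),
    (0.10, 0.10, 0.10, 0.40),
]
weights, ppl = anneal_weights(toy)
```

Because the search starts from uniform weights and keeps the best point visited, the returned perplexity is never worse than that of the uniform combination. On real data the component probabilities would come from the trained models rather than from a hand-written table.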