1 |
Influence of Highly Inflected Word Forms and Acoustic Background on the Robustness of Automatic Speech Recognition for Human–Computer Interaction
In: Mathematics; Volume 10; Issue 5; Pages: 711 (2022)
BASE
2 |
Discriminative feature modeling for statistical speech recognition ...
3 |
Cross-lingual acoustic modeling in upper sorbian - preliminary study
In: Fraunhofer IKTS (2021)
4 |
Glottal Stops in Upper Sorbian: A Data-Driven Approach
In: Fraunhofer IKTS (2021)
5 |
Estimating the Degree of Sleepiness by Integrating Articulatory Feature Knowledge in Raw Waveform Based CNNS ...
6 |
Estimating the Degree of Sleepiness by Integrating Articulatory Feature Knowledge in Raw Waveform Based CNNS ...
7 |
Dealing with linguistic mismatches for automatic speech recognition
8 |
Speech recognition with probabilistic transcriptions and end-to-end systems using deep learning
9 |
Phonetic Context Embeddings for DNN-HMM Phone Recognition
In: Interspeech 2016 ; https://hal.sorbonne-universite.fr/hal-02166078 ; Interspeech 2016, Sep 2016, SAN FRANCISCO, United States. pp.405-409, ⟨10.21437/Interspeech.2016-1036⟩ (2016)
11 |
Modeling of a rise-fall intonation pattern in the language of young Paris Speakers
In: Speech Prosody ; https://halshs.archives-ouvertes.fr/halshs-01069584 ; Speech Prosody, 2014, 7, pp.814-818 (2014)
12 |
Vers une modélisation acoustique de l'intonation des jeunes en région parisienne : une question de " proximité " ?
In: ISSN: 1661-8246 ; EISSN: 1661-8246 ; Nouveaux Cahiers de Linguistique Française ; https://halshs.archives-ouvertes.fr/halshs-01069593 ; Nouveaux Cahiers de Linguistique Française, Université de Genève, 2014, 31, pp.257-171 (2014)
13 |
Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a 'light' acoustic model of the target language and testing 'heavyweight' models from five national languages
In: Proceedings of the 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2014) ; 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2014) ; https://halshs.archives-ouvertes.fr/halshs-00980431 ; 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2014), May 2014, St Petersburg, Russia. pp.153-160 (2014)
14 |
Modélisation acoustico-phonétique de langues peu dotées : Études phonétiques et travaux de reconnaissance automatique en luxembourgois
In: Journées d'Etude sur la Parole ; https://hal.archives-ouvertes.fr/hal-01843399 ; Journées d'Etude sur la Parole, Jan 2014, Le Mans, France (2014)
15 |
Speech Alignment and Recognition Experiments for Luxembourgish
In: Proceedings of the 4th International Workshop on Spoken Language Technologies for Underresourced Languages ; 4th International Workshop on Spoken Language Technologies for Underresourced Languages ; https://hal.archives-ouvertes.fr/hal-01134824 ; 4th International Workshop on Spoken Language Technologies for Underresourced Languages, May 2014, Saint-Petersbourg, Russia. pp.53-60 ; http://www.mica.edu.vn/sltu2014/ (2014)
Abstract:
Luxembourgish, embedded in a multilingual context on the divide between Romance and Germanic cultures, remains one of Europe’s under-described languages. In this paper, we propose to study acoustic similarities between Luxembourgish and major contact languages (German, French, English) with the help of automatic speech alignment and recognition systems. Experiments were run using monolingual acoustic models trained on German, French and English together with (i) “multilingual” models trained on pooled speech data from these three languages, or with (ii) native Luxembourgish acoustic models from 1200 hours of untranscribed Luxembourgish audio data using unsupervised methods. We investigated whether Luxembourgish was globally better represented by one of the individual languages, by the multilingual model or by the native (unsupervised) model. While German provides globally the best acoustic match for native Luxembourgish, detailed analyses reveal language-specific preferences; in particular, English and Luxembourgish models are preferred on diphthongs. The first ASR results illustrate the accuracy of the various sets of supervised monolingual and multilingual models versus unsupervised Luxembourgish acoustic models. The ASR word error rate is progressively reduced from 60% to 25% on the development data set by unsupervised training of larger context-dependent models on increasing amounts of audio data.
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; acoustic modeling; forced alignment; language similarity; languages in contact; large vocabulary speech recognition; Luxembourgish; multilingual models; under-resourced languages; unsupervised training
URL: https://hal.archives-ouvertes.fr/hal-01134824
16 |
A First LVCSR System for Luxembourgish, a Low-Resourced European Language
In: Human Language Technology Challenges for Computer Science and Linguistics ; https://hal.archives-ouvertes.fr/hal-01135103 ; Zygmunt Vetulani; Joseph Mariani. Human Language Technology Challenges for Computer Science and Linguistics, 8387, Springer International Publishing, pp.479-490, 2014, 5th Language and Technology Conference, LTC 2011, Poznań, Poland, November 25--27, 2011, Revised Selected Papers, 978-3-319-08957-7. ⟨10.1007/978-3-319-08958-4_39⟩ (2014)
17 |
Impact of Video Modeling Techniques on Efficiency and Effectiveness of Clinical Voice Assessment
In: http://rave.ohiolink.edu/etdc/view?acc_num=miami1398686540 (2014)
18 |
Anger Recognition in Speech Using Acoustic and Linguistic Cues
Elsevier, 2013
19 |
Detection of acoustic-phonetic landmarks in mismatched conditions using a biomimetic model of human auditory processing
In: http://www.isle.uiuc.edu/%7Esborys/king_coling12.pdf (2012)
20 |
Detection of acoustic-phonetic landmarks in mismatched conditions using a biomimetic model of human auditory processing
In: http://aclweb.org/anthology/C/C12/C12-2058.pdf (2012)