
Search in the Catalogues and Directories

Hits 1 – 20 of 68

1
Influence of Highly Inflected Word Forms and Acoustic Background on the Robustness of Automatic Speech Recognition for Human–Computer Interaction
In: Mathematics; Volume 10; Issue 5; Pages: 711 (2022)
BASE
2
Discriminative feature modeling for statistical speech recognition ...
Tüske, Zoltán. - : RWTH Aachen University, 2021
BASE
3
Cross-lingual acoustic modeling in Upper Sorbian - preliminary study
In: Fraunhofer IKTS (2021)
BASE
4
Glottal Stops in Upper Sorbian: A Data-Driven Approach
In: Fraunhofer IKTS (2021)
BASE
5
Estimating the Degree of Sleepiness by Integrating Articulatory Feature Knowledge in Raw Waveform Based CNNs ...
BASE
6
Estimating the Degree of Sleepiness by Integrating Articulatory Feature Knowledge in Raw Waveform Based CNNs ...
BASE
7
Dealing with linguistic mismatches for automatic speech recognition
Yang, Xuesong. - 2019
BASE
8
Speech recognition with probabilistic transcriptions and end-to-end systems using deep learning
Das, Amit. - 2018
BASE
9
Phonetic Context Embeddings for DNN-HMM Phone Recognition
In: Interspeech 2016, Sep 2016, San Francisco, United States, pp. 405-409, ⟨10.21437/Interspeech.2016-1036⟩; https://hal.sorbonne-universite.fr/hal-02166078 (2016)
Abstract: This paper proposes an approach, named phonetic context embedding, to model phonetic context effects for deep neural network hidden Markov model (DNN-HMM) phone recognition. Phonetic context embeddings can be regarded as continuous and distributed vector representations of context-dependent phonetic units (e.g., triphones). In this work they are computed using neural networks. First, all phone labels are mapped into vectors of binary distinctive features (DFs, e.g., nasal/not-nasal). Then, for each speech frame, the corresponding DF vector is concatenated with the DF vectors of the previous and next frames and fed into a neural network that is trained to estimate the acoustic coefficients (e.g., MFCCs) of that frame. The values of the first hidden layer represent the embedding of the input DF vectors. Finally, the resulting embeddings are used as secondary task targets in a multi-task learning (MTL) setting when training the DNN that computes phone state posteriors. The approach makes it easy to encode a much larger context than alternative MTL-based approaches. Results on TIMIT with a fully connected DNN show phone error rate (PER) reductions from 22.4% to 21.0% and from 21.3% to 19.8% on the core test set and the validation set respectively, and a lower PER than an alternative strong MTL approach.
Keyword: [SCCO.LING]Cognitive science/Linguistics; [SCCO]Cognitive science; acoustic modeling; Index Terms: embeddings; multi-task learning; speech recognition
URL: https://hal.sorbonne-universite.fr/hal-02166078
https://doi.org/10.21437/Interspeech.2016-1036
BASE
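The abstract of hit 9 walks through the embedding pipeline step by step: phones are mapped to binary distinctive-feature (DF) vectors, the DF vectors of neighbouring frames are concatenated, an auxiliary network is trained to predict the frame's acoustic coefficients, and its first hidden layer is kept as the embedding. The following Python sketch illustrates that pipeline on toy data; all sizes, variable names, and the random data are illustrative assumptions, not the authors' implementation.

# Minimal sketch of the "phonetic context embedding" idea described in the
# abstract above (hal-02166078). Toy data and dimensions are assumptions.
import numpy as np

rng = np.random.default_rng(0)

# 1) Map each phone label to a vector of binary distinctive features
#    (e.g., nasal / not-nasal). Here: 40 phones, 12 hypothetical DFs.
N_PHONES, N_DF = 40, 12
phone_to_df = rng.integers(0, 2, size=(N_PHONES, N_DF)).astype(float)

# 2) For every frame, concatenate the DF vector of the current phone with
#    those of the previous and next frames (context of +/-1 here; the paper's
#    point is that this window can be enlarged cheaply). np.roll wraps at the
#    edges, which is acceptable for a toy example.
T = 500                                   # frames in the toy utterance
phones = rng.integers(0, N_PHONES, size=T)
df = phone_to_df[phones]                  # (T, N_DF)
ctx = np.concatenate([np.roll(df, 1, axis=0), df, np.roll(df, -1, axis=0)], axis=1)

# 3) Train a small network to predict the frame's acoustic coefficients
#    (e.g., MFCCs) from the context DF vector; random targets stand in for MFCCs.
N_MFCC, N_HID = 13, 64
mfcc = rng.normal(size=(T, N_MFCC))       # placeholder acoustic targets
W1 = rng.normal(scale=0.1, size=(ctx.shape[1], N_HID)); b1 = np.zeros(N_HID)
W2 = rng.normal(scale=0.1, size=(N_HID, N_MFCC));       b2 = np.zeros(N_MFCC)

lr = 0.05
for _ in range(200):                      # plain full-batch gradient descent
    h = np.tanh(ctx @ W1 + b1)            # first hidden layer
    pred = h @ W2 + b2
    err = pred - mfcc                     # mean-squared-error gradient
    dW2 = h.T @ err / T; db2 = err.mean(0)
    dh = (err @ W2.T) * (1 - h ** 2)
    dW1 = ctx.T @ dh / T; db1 = dh.mean(0)
    W1 -= lr * dW1; b1 -= lr * db1; W2 -= lr * dW2; b2 -= lr * db2

# 4) The first hidden layer activations are the phonetic context embeddings;
#    in the paper they become secondary targets in a multi-task setup when
#    training the DNN that outputs phone-state posteriors.
embeddings = np.tanh(ctx @ W1 + b1)       # (T, N_HID)
print(embeddings.shape)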
10
Robust automatic speech recognition for children ...
Gurunath Shivakumar, Prashanth. - : University of Southern California Digital Library (USC.DL), 2015
BASE
11
Modeling of a rise-fall intonation pattern in the language of young Paris speakers
In: Speech Prosody 7, 2014, pp. 814-818; https://halshs.archives-ouvertes.fr/halshs-01069584 (2014)
BASE
12
Vers une modélisation acoustique de l'intonation des jeunes en région parisienne : une question de "proximité" ? [Towards an acoustic model of the intonation of young speakers in the Paris region: a question of "proximity"?]
In: Nouveaux Cahiers de Linguistique Française, Université de Genève, 31, 2014, pp. 257-171, ISSN 1661-8246; https://halshs.archives-ouvertes.fr/halshs-01069593 (2014)
BASE
13
Towards the automatic processing of Yongning Na (Sino-Tibetan): developing a 'light' acoustic model of the target language and testing 'heavyweight' models from five national languages
In: Proceedings of the 4th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU 2014), May 2014, St Petersburg, Russia, pp. 153-160; https://halshs.archives-ouvertes.fr/halshs-00980431 (2014)
BASE
14
Modélisation acoustico-phonétique de langues peu dotées : études phonétiques et travaux de reconnaissance automatique en luxembourgeois [Acoustic-phonetic modeling of under-resourced languages: phonetic studies and automatic speech recognition work on Luxembourgish]
In: Journées d'Etude sur la Parole, Jan 2014, Le Mans, France; https://hal.archives-ouvertes.fr/hal-01843399 (2014)
BASE
15
Speech Alignment and Recognition Experiments for Luxembourgish
In: Proceedings of the 4th International Workshop on Spoken Language Technologies for Underresourced Languages, May 2014, Saint Petersburg, Russia, pp. 53-60; https://hal.archives-ouvertes.fr/hal-01134824; http://www.mica.edu.vn/sltu2014/ (2014)
BASE
16
A First LVCSR System for Luxembourgish, a Low-Resourced European Language
In: Zygmunt Vetulani; Joseph Mariani (eds.), Human Language Technology Challenges for Computer Science and Linguistics: 5th Language and Technology Conference, LTC 2011, Poznań, Poland, November 25-27, 2011, Revised Selected Papers, Springer International Publishing, vol. 8387, pp. 479-490, 2014, ISBN 978-3-319-08957-7, ⟨10.1007/978-3-319-08958-4_39⟩; https://hal.archives-ouvertes.fr/hal-01135103 (2014)
BASE
17
Impact of Video Modeling Techniques on Efficiency and Effectiveness of Clinical Voice Assessment
In: http://rave.ohiolink.edu/etdc/view?acc_num=miami1398686540 (2014)
BASE
18
Anger Recognition in Speech Using Acoustic and Linguistic Cues
Elsevier, 2013
BASE
19
Detection of acoustic-phonetic landmarks in mismatched conditions using a biomimetic model of human auditory processing
In: http://www.isle.uiuc.edu/%7Esborys/king_coling12.pdf (2012)
BASE
20
Detection of acoustic-phonetic landmarks in mismatched conditions using a biomimetic model of human auditory processing
In: http://aclweb.org/anthology/C/C12/C12-2058.pdf (2012)
BASE


Catalogues: 0 | Bibliographies: 0 | Linked Open Data catalogues: 0 | Online resources: 0 | Open access documents: 68
© 2013 - 2024 Lin|gu|is|tik