
Search in the Catalogues and Directories

Hits 1 – 20 of 1,597

1
A Bottleneck Auto-Encoder for F0 Transformations on Speech and Singing Voice
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599085 ; Information, MDPI, 2022, 13 (3), pp.102. ⟨10.3390/info13030102⟩ (2022)
BASE
2
Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599076 ; Information, MDPI, 2022, 13 (3), pp.103. ⟨10.3390/info13030103⟩ (2022)
BASE
3
Multistream neural architectures for cued-speech recognition using a pre-trained visual feature extractor and constrained CTC decoding
In: ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing ; https://hal.archives-ouvertes.fr/hal-03578503 ; ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing, May 2022, Singapour, Singapore (2022)
Abstract: This paper proposes a simple and effective approach for automatic recognition of Cued Speech (CS), a visual communication tool that helps people with hearing impairment to understand spoken language with the help of hand gestures that can uniquely identify the uttered phonemes in complement to lipreading. The proposed approach is based on a pre-trained hand and lips tracker used for visual feature extraction and a phonetic decoder based on a multistream recurrent neural network trained with connectionist temporal classification loss and combined with a pronunciation lexicon. The proposed system is evaluated on an updated version of the French CS dataset CSF18 for which the phonetic transcription has been manually checked and corrected. With a decoding accuracy at the phonetic level of 70.88%, the proposed system outperforms our previous CNN-HMM decoder and competes with more complex baselines.
Keyword: [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; cued speech; hearing impairment; multi-modality; neural network; Visual speech
URL: https://hal.archives-ouvertes.fr/hal-03578503v2/file/Practical%20ACSR.pdf
https://hal.archives-ouvertes.fr/hal-03578503v2/document
https://hal.archives-ouvertes.fr/hal-03578503
BASE
4
An Overview of Indian Spoken Language Recognition from Machine Learning Perspective
In: ISSN: 2375-4699 ; EISSN: 2375-4702 ; ACM Transactions on Asian and Low-Resource Language Information Processing ; https://hal.inria.fr/hal-03616853 ; ACM Transactions on Asian and Low-Resource Language Information Processing, ACM, In press, ⟨10.1145/3523179⟩ (2022)
BASE
5
Etude de cas de pathologies de la parole dans le cadre de la prise en charge orthophonique
In: https://hal.archives-ouvertes.fr/hal-03568182 ; 2022 (2022)
BASE
6
Differentially private speaker anonymization
In: https://hal.inria.fr/hal-03588932 ; 2022 (2022)
BASE
7
Automatic assessment of oral readings of young pupils
In: ISSN: 0167-6393 ; EISSN: 1872-7182 ; Speech Communication ; https://hal.archives-ouvertes.fr/hal-03585934 ; Speech Communication, Elsevier : North-Holland, 2022, 138, pp.67-79. ⟨10.1016/j.specom.2022.01.008⟩ ; https://www.sciencedirect.com/science/article/pii/S0167639322000164?via%3Dihub (2022)
BASE
8
Unsupervised quantification of entity consistency between photos and text in real-world news ...
Müller-Budack, Eric. - : Hannover : Institutionelles Repositorium der Leibniz Universität Hannover, 2022
BASE
9
Danish Fungi 2020
Picek, Lukáš; Šulc, Milan; Matas, Jiří. - : IEEE/CVF, 2022
BASE
10
Principles of Learning in Multitask Settings: A Probabilistic Perspective ...
Al-shedivat, Maruan. - : Carnegie Mellon University, 2022
BASE
11
Principles of Learning in Multitask Settings: A Probabilistic Perspective ...
Al-shedivat, Maruan. - : Carnegie Mellon University, 2022
BASE
12
The 2021 NIST Speaker Recognition Evaluation ...
BASE
13
Cross-view Brain Decoding ...
BASE
14
Learning English with Peppa Pig ...
BASE
15
Who has ears, listen: Citizen Listening Program for disease prevention. ...
García Pereira, Ramiro. - : figshare, 2022
BASE
16
Who has ears, listen: Citizen Listening Program for disease prevention. ...
García Pereira, Ramiro. - : figshare, 2022
BASE
17
Segmentation of Glottal Images from High-Speed Videoendoscopy Optimized by Synchronous Acoustic Recordings
In: Sensors; Volume 22; Issue 5; Pages: 1751 (2022)
BASE
18
Connecting Text Classification with Image Classification: A New Preprocessing Method for Implicit Sentiment Text Classification
In: Sensors; Volume 22; Issue 5; Pages: 1899 (2022)
BASE
19
A Study of F0 Modification for X-Vector Based Speech Pseudonymization Across Gender
In: PPAI 2021 - The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence ; https://hal.archives-ouvertes.fr/hal-02995862 ; PPAI 2021 - The Second AAAI Workshop on Privacy-Preserving Artificial Intelligence, Feb 2021, Virtual, China (2021)
BASE
20
Assessment of adult speech disorders: current situation and needs in French-speaking clinical practice
In: ISSN: 1401-5439 ; Logopedics Phoniatrics Vocology ; https://hal.archives-ouvertes.fr/hal-03120115 ; Logopedics Phoniatrics Vocology, Taylor & Francis, 2021, pp.1-15. ⟨10.1080/14015439.2020.1870245⟩ (2021)
BASE


Catalogues: 0
Bibliographies: 3
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 1,594