DE eng

Search in the Catalogues and Directories

Page: 1 2 3 4 5...8
Hits 1 – 20 of 154

1
A Bottleneck Auto-Encoder for F0 Transformations on Speech and Singing Voice
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599085 ; Information, MDPI, 2022, 13 (3), pp.102. ⟨10.3390/info13030102⟩ (2022)
BASE
Show details
2
Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599076 ; Information, MDPI, 2022, 13 (3), pp.103. ⟨10.3390/info13030103⟩ (2022)
BASE
Show details
3
Learning and controlling the source-filter representation of speech with a variational autoencoder
In: https://hal.archives-ouvertes.fr/hal-03650569 ; 2022 (2022)
BASE
Show details
4
Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition
In: EUSIPCO 2020 - 28th European Signal Processing Conference ; https://hal.inria.fr/hal-02355669 ; EUSIPCO 2020 - 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands. ⟨10.23919/Eusipco47968.2020.9287541⟩ ; https://eusipco2020.org/ (2021)
BASE
Show details
5
High-resolution speaker counting in reverberant rooms using CRNN with Ambisonics features
In: EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO) ; https://hal.archives-ouvertes.fr/hal-03537323 ; EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO), Jan 2021, Amsterdam, Netherlands. pp.71-75, ⟨10.23919/Eusipco47968.2020.9287637⟩ (2021)
Abstract: International audience ; Speaker counting is the task of estimating the number of people that are simultaneously speaking in an audio recording. For several audio processing tasks such as speaker diarization, separation, localization and tracking, knowing the number of speakers at each timestep is a prerequisite, or at least it can be a strong advantage, in addition to enabling a low latency processing. For that purpose, we address the speaker counting problem with a multichannel convolutional recurrent neural network which produces an estimation at a short-term frame resolution. We trained the network to predict up to 5 concurrent speakers in a multichannel mixture, with simulated data including many different conditions in terms of source and microphone positions, reverberation, and noise. The network can predict the number of speakers with good accuracy at frame resolution.
Keyword: [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; CRNN; reverberation; Speaker counting
URL: https://doi.org/10.23919/Eusipco47968.2020.9287637
https://hal.archives-ouvertes.fr/hal-03537323
https://hal.archives-ouvertes.fr/hal-03537323/file/eusipco2020.pdf
https://hal.archives-ouvertes.fr/hal-03537323/document
BASE
Hide details
6
Automatic Speech Recognition systems errors for accident-prone sleepiness detection through voice
In: EUSIPCO 2021 ; https://hal.archives-ouvertes.fr/hal-03324033 ; EUSIPCO 2021, Aug 2021, Dublin (en ligne), Ireland. ⟨10.23919/EUSIPCO54536.2021.9616299⟩ (2021)
BASE
Show details
7
Automatic Speech Recognition systems errors for objective sleepiness detection through voice
In: Proceedings Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03328827 ; Interspeech 2021, Aug 2021, Brno (virtual), Czech Republic. pp.2476-2480, ⟨10.21437/Interspeech.2021-291⟩ (2021)
BASE
Show details
8
Speaker Attentive Speech Emotion Recognition
In: Proccedings of interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03554368 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.2866-2870, ⟨10.21437/interspeech.2021-573⟩ (2021)
BASE
Show details
9
Prosodic Boundary Prediction Model for Vietnamese Text-To-Speech
In: Proc. Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329116 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.3885-3889, ⟨10.21437/interspeech.2021-125⟩ (2021)
BASE
Show details
10
Learning robust speech representation with an articulatory-regularized variational autoencoder
In: Proccedings of Interspeech 2021 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-03373252 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021)
BASE
Show details
11
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
In: INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association ; https://hal.inria.fr/hal-03329245 ; INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021)
BASE
Show details
12
Learning spectro-temporal representations of complex sounds with parameterized neural networks
In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.inria.fr/hal-03329261 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (1), pp.353-366. ⟨10.1121/10.0005482⟩ (2021)
BASE
Show details
13
Large vocabulary automatic speech recognition: from hybrid to end-to-end approaches ; Reconnaissance automatique de la parole à large vocabulaire : des approches hybrides aux approches End-to-End
Heba, Abdelwahab. - : HAL CCSD, 2021
In: https://hal.archives-ouvertes.fr/tel-03269807 ; Son [cs.SD]. Université toulouse 3 Paul Sabatier, 2021. Français (2021)
BASE
Show details
14
Leveraging lyrics from audio for MIR ; Exploiter les paroles de chansons à partir de l'audio pour le MIR
Vaglio, Andrea. - : HAL CCSD, 2021
In: https://tel.archives-ouvertes.fr/tel-03558515 ; Signal and Image processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAT027⟩ (2021)
BASE
Show details
15
Prosodic Disambiguation Using Chironomic Stylization of Intonation with Native and Non-Native Speakers
In: Proceedings Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329111 ; Interspeech 2021, Aug 2021, Brno (virtual), Czech Republic. pp.516-520, ⟨10.21437/Interspeech.2021-182⟩ (2021)
BASE
Show details
16
Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery
In: https://hal.archives-ouvertes.fr/hal-03467205 ; 2021 (2021)
BASE
Show details
17
Vocal drum sounds in human beatboxing: An acoustic and articulatory exploration using electromagnetic articulography
In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.univ-grenoble-alpes.fr/hal-03107358 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 149 (1), pp.191-206. ⟨10.1121/10.0002921⟩ ; https://asa.scitation.org/doi/full/10.1121/10.0002921 (2021)
BASE
Show details
18
Perceptual equivalence of the Liljencrants-Fant and linear-filter glottal flow models
In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.archives-ouvertes.fr/hal-03322875 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (2), pp.1273-1285. ⟨10.1121/10.0005879⟩ ; https://doi.org/10.1121/10.0005879 (2021)
BASE
Show details
19
End-to-End Speech Emotion Recognition: Challenges of Real-Life Emergency Call Centers Data Recordings
In: ISBN: 978-1-6654-0019-0 ; 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII) ; https://hal.archives-ouvertes.fr/hal-03405970 ; 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII), Sep 2021, Nara, Japan ; https://www.acii-conf.net/2021/ (2021)
BASE
Show details
20
Automated audio captioning by fine-tuning bart with audioset tags
In: DCASE 2021 - 6th Workshop on Detection and Classification of Acoustic Scenes and Events ; https://hal.inria.fr/hal-03522488 ; DCASE 2021 - 6th Workshop on Detection and Classification of Acoustic Scenes and Events, Nov 2021, Virtual, Spain (2021)
BASE
Show details

Page: 1 2 3 4 5...8

Catalogues
0
0
0
0
0
0
0
Bibliographies
0
0
0
0
0
0
0
0
0
Linked Open Data catalogues
0
Online resources
0
0
0
0
Open access documents
154
0
0
0
0
© 2013 - 2024 Lin|gu|is|tik | Imprint | Privacy Policy | Datenschutzeinstellungen ändern