1 |
A Bottleneck Auto-Encoder for F0 Transformations on Speech and Singing Voice
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599085 ; Information, MDPI, 2022, 13 (3), pp.102. ⟨10.3390/info13030102⟩ (2022)
BASE
2 |
Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599076 ; Information, MDPI, 2022, 13 (3), pp.103. ⟨10.3390/info13030103⟩ (2022)
BASE
3 |
Learning and controlling the source-filter representation of speech with a variational autoencoder
In: https://hal.archives-ouvertes.fr/hal-03650569 ; 2022 (2022)
BASE
4 |
Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition
In: EUSIPCO 2020 - 28th European Signal Processing Conference ; https://hal.inria.fr/hal-02355669 ; EUSIPCO 2020 - 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands. ⟨10.23919/Eusipco47968.2020.9287541⟩ ; https://eusipco2020.org/ (2021)
BASE
5 |
High-resolution speaker counting in reverberant rooms using CRNN with Ambisonics features
In: EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO) ; https://hal.archives-ouvertes.fr/hal-03537323 ; EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO), Jan 2021, Amsterdam, Netherlands. pp.71-75, ⟨10.23919/Eusipco47968.2020.9287637⟩ (2021)
BASE
6 |
Automatic Speech Recognition systems errors for accident-prone sleepiness detection through voice
In: EUSIPCO 2021 ; https://hal.archives-ouvertes.fr/hal-03324033 ; EUSIPCO 2021, Aug 2021, Dublin (online), Ireland. ⟨10.23919/EUSIPCO54536.2021.9616299⟩ (2021)
BASE
7 |
Automatic Speech Recognition systems errors for objective sleepiness detection through voice
In: Proceedings Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03328827 ; Interspeech 2021, Aug 2021, Brno (virtual), Czech Republic. pp.2476-2480, ⟨10.21437/Interspeech.2021-291⟩ (2021)
BASE
8 |
Speaker Attentive Speech Emotion Recognition
In: Proceedings of Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03554368 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.2866-2870, ⟨10.21437/interspeech.2021-573⟩ (2021)
BASE
9 |
Prosodic Boundary Prediction Model for Vietnamese Text-To-Speech
In: Proc. Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329116 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.3885-3889, ⟨10.21437/interspeech.2021-125⟩ (2021)
BASE
10 |
Learning robust speech representation with an articulatory-regularized variational autoencoder
In: Proceedings of Interspeech 2021 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-03373252 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021)
BASE
11 |
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
In: INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association ; https://hal.inria.fr/hal-03329245 ; INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021)
BASE
12 |
Learning spectro-temporal representations of complex sounds with parameterized neural networks
In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.inria.fr/hal-03329261 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (1), pp.353-366. ⟨10.1121/10.0005482⟩ (2021)
BASE
13 |
Large vocabulary automatic speech recognition: from hybrid to end-to-end approaches ; Reconnaissance automatique de la parole à large vocabulaire : des approches hybrides aux approches End-to-End
In: https://hal.archives-ouvertes.fr/tel-03269807 ; Sound [cs.SD]. Université Toulouse 3 Paul Sabatier, 2021. French (2021)
BASE
14 |
Leveraging lyrics from audio for MIR ; Exploiter les paroles de chansons à partir de l'audio pour le MIR
In: https://tel.archives-ouvertes.fr/tel-03558515 ; Signal and Image processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAT027⟩ (2021)
BASE
15 |
Prosodic Disambiguation Using Chironomic Stylization of Intonation with Native and Non-Native Speakers
In: Proceedings Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329111 ; Interspeech 2021, Aug 2021, Brno (virtual), Czech Republic. pp.516-520, ⟨10.21437/Interspeech.2021-182⟩ (2021)
BASE
16 |
Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery
In: https://hal.archives-ouvertes.fr/hal-03467205 ; 2021 (2021)
BASE
17 |
Vocal drum sounds in human beatboxing: An acoustic and articulatory exploration using electromagnetic articulography
In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.univ-grenoble-alpes.fr/hal-03107358 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 149 (1), pp.191-206. ⟨10.1121/10.0002921⟩ ; https://asa.scitation.org/doi/full/10.1121/10.0002921 (2021)
Abstract:
Acoustic characteristics, lingual and labial articulatory dynamics, and ventilatory behaviors were studied in a beatboxer producing twelve drum sounds belonging to five main categories of his repertoire (kick, snare, hi-hat, rim-shot, cymbal). Various types of experimental data were collected synchronously (respiratory inductance plethysmography, electroglottography, electromagnetic articulography, and acoustic recording). Automatic unsupervised classification was successfully applied to the acoustic data with the t-SNE spectral clustering technique. A cluster purity value of 94% was achieved, showing that each sound has a specific acoustic signature. The acoustical intensity of sounds produced with the humming technique was found to be significantly lower than that of their non-humming counterparts. For these sounds, a dissociation between articulation and breathing was observed. Overall, a wide range of articulatory gestures was observed, some of which were non-linguistic. The tongue was systematically involved in the articulation of the explored beatboxing sounds, either as the main articulator or as an accompaniment to the lip dynamics. Two pulmonic and three non-pulmonic airstream mechanisms were identified. Ejectives were found in the production of all the sounds with bilabial occlusion or alveolar occlusion with egressive airstream. A phonetic annotation using the IPA alphabet was performed, highlighting the complexity of such sound production and the limits of speech-based annotation.
Keyword:
[INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; [SHS.MUSIQ]Humanities and Social Sciences/Musicology and performing arts; [SPI.ACOU]Engineering Sciences [physics]/Acoustics [physics.class-ph]; [SPI.MECA.BIOM]Engineering Sciences [physics]/Mechanics [physics.med-ph]/Biomechanics [physics.med-ph]; Acoustics; Articulator; Larynx; Medical diagnosis; Phonetics; Sound production technology; Speech communication
URL: https://hal.univ-grenoble-alpes.fr/hal-03107358 https://hal.univ-grenoble-alpes.fr/hal-03107358/document https://hal.univ-grenoble-alpes.fr/hal-03107358/file/Paroni_JASA_2021.pdf https://doi.org/10.1121/10.0002921
BASE
18 |
Perceptual equivalence of the Liljencrants-Fant and linear-filter glottal flow models
In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.archives-ouvertes.fr/hal-03322875 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (2), pp.1273-1285. ⟨10.1121/10.0005879⟩ ; https://doi.org/10.1121/10.0005879 (2021)
BASE
19 |
End-to-End Speech Emotion Recognition: Challenges of Real-Life Emergency Call Centers Data Recordings
In: ISBN: 978-1-6654-0019-0 ; 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII) ; https://hal.archives-ouvertes.fr/hal-03405970 ; 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII), Sep 2021, Nara, Japan ; https://www.acii-conf.net/2021/ (2021)
BASE
20 |
Automated audio captioning by fine-tuning BART with AudioSet tags
In: DCASE 2021 - 6th Workshop on Detection and Classification of Acoustic Scenes and Events ; https://hal.inria.fr/hal-03522488 ; DCASE 2021 - 6th Workshop on Detection and Classification of Acoustic Scenes and Events, Nov 2021, Virtual, Spain (2021)
BASE