1 |
A Bottleneck Auto-Encoder for F0 Transformations on Speech and Singing Voice
|
|
|
|
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599085 ; Information, MDPI, 2022, 13 (3), pp.102. ⟨10.3390/info13030102⟩ (2022)
|
|
BASE
|
|
2 |
Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet
|
|
|
|
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599076 ; Information, MDPI, 2022, 13 (3), pp.103. ⟨10.3390/info13030103⟩ (2022)
|
|
Abstract:
The use of the mel spectrogram as a signal parameterization for voice generation is quite recent and linked to the development of neural vocoders. These are deep neural networks that reconstruct high-quality speech from a given mel spectrogram. While initially developed for speech synthesis, neural vocoders have since also been studied in the context of voice attribute manipulation, opening new means for voice processing in audio production. However, to apply neural vocoders in real-world applications, two problems need to be addressed: (1) to support use in professional audio workstations, the computational complexity should be small, and (2) the vocoder needs to support a large variety of speakers, differences in voice qualities, and the wide range of intensities potentially encountered during audio production. In this context, the present study provides a detailed description of the Multi-band Excited WaveNet, a fully convolutional neural vocoder built around signal processing blocks. It evaluates the performance of the vocoder when trained on a variety of multi-speaker and multi-singer databases, including an experimental evaluation of the neural vocoder trained on speech and singing voices. To address the problem of intensity variation, the study introduces a new adaptive signal normalization scheme that allows for robust compensation for dynamic and static gain variations. Evaluations are performed using objective measures and a number of perceptual tests that include different neural vocoder algorithms known from the literature. The results confirm that the proposed vocoder compares favorably to the state of the art in its capacity to generalize to unseen voices and voice qualities. The remaining challenges are discussed.
|
|
Keyword:
[INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing
|
|
URL: https://doi.org/10.3390/info13030103 https://hal.archives-ouvertes.fr/hal-03599076
|
|
BASE
|
|
3 |
Etude de cas de pathologies de la parole dans le cadre de la prise en charge orthophonique [Case study of speech pathologies in the context of speech therapy care]
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03568182 ; 2022 (2022)
|
|
BASE
|
|
4 |
Automatic assessment of oral readings of young pupils
|
|
|
|
In: ISSN: 0167-6393 ; EISSN: 1872-7182 ; Speech Communication ; https://hal.archives-ouvertes.fr/hal-03585934 ; Speech Communication, Elsevier : North-Holland, 2022, 138, pp.67-79. ⟨10.1016/j.specom.2022.01.008⟩ ; https://www.sciencedirect.com/science/article/pii/S0167639322000164?via%3Dihub (2022)
|
|
BASE
|
|
5 |
Automatic Speech Recognition systems errors for accident-prone sleepiness detection through voice
|
|
|
|
In: EUSIPCO 2021 ; https://hal.archives-ouvertes.fr/hal-03324033 ; EUSIPCO 2021, Aug 2021, Dublin (online), Ireland. ⟨10.23919/EUSIPCO54536.2021.9616299⟩ (2021)
|
|
BASE
|
|
6 |
Automatic Speech Recognition systems errors for objective sleepiness detection through voice
|
|
|
|
In: Proceedings Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03328827 ; Interspeech 2021, Aug 2021, Brno (virtual), Czech Republic. pp.2476-2480, ⟨10.21437/Interspeech.2021-291⟩ (2021)
|
|
BASE
|
|
7 |
Introducing an experimental distortion-tolerant speech encryption scheme for secure voice communication
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03445994 ; 2021 (2021)
|
|
BASE
|
|
8 |
Spat~ : a comprehensive toolbox for sound spatialization in Max
|
|
|
|
In: ISSN: 2317-9694 ; Ideas Sonicas ; https://hal.archives-ouvertes.fr/hal-03356292 ; Ideas Sonicas, João Pedro Oliveira, 2021, Electroacoustic Space - Reflections - Tools for its design, 13 (24), pp.12 - 23 ; http://sonicideas.org (2021)
|
|
BASE
|
|
9 |
Re-synchronization using the Hand Preceding Model for Multi-modal Fusion in Automatic Continuous Cued Speech Recognition
|
|
|
|
In: ISSN: 1520-9210 ; IEEE Transactions on Multimedia ; https://hal.archives-ouvertes.fr/hal-02433830 ; IEEE Transactions on Multimedia, Institute of Electrical and Electronics Engineers, 2021, 23, pp.292-305. ⟨10.1109/TMM.2020.2976493⟩ (2021)
|
|
BASE
|
|
10 |
Human Beatbox: from extreme use of voice and speech to its use in speech therapy ; Le Human Beatbox : d’une utilisation extrême de la voix et de la parole à son utilité en orthophonie
|
|
|
|
In: ISSN: 0034-222X ; Rééducation orthophonique ; https://hal.archives-ouvertes.fr/hal-03377693 ; Rééducation orthophonique, Ortho édition, 2021, Rééducation orthophonique n°286 - Les phonations : sur la voie des voix, 286 ; https://www.orthoedition.com/revues/n-les-phonations-sur-la-voie-des-voix-4341.html (2021)
|
|
BASE
|
|
11 |
Analyse objective de la parole dysarthrique : évaluation d’une sélection d’indices acoustiques [Objective analysis of dysarthric speech: evaluation of a selection of acoustic indices]
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03139503 ; 2021 (2021)
|
|
BASE
|
|
12 |
Automatic risk detection system by audiovisual signal processing ; Système de détection automatique de risques par traitement de signaux audiovisuels
|
|
|
|
In: https://tel.archives-ouvertes.fr/tel-03602318 ; Signal and Image processing. Université Polytechnique Hauts-de-France; Institut national des sciences appliquées Hauts-de-France, 2021. English. ⟨NNT : 2021UPHF0040⟩ (2021)
|
|
BASE
|
|
13 |
Leveraging lyrics from audio for MIR ; Exploiter les paroles de chansons à partir de l'audio pour le MIR
|
|
|
|
In: https://tel.archives-ouvertes.fr/tel-03558515 ; Signal and Image processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAT027⟩ (2021)
|
|
BASE
|
|
14 |
A bio-inspired geometric model for sound reconstruction
|
|
|
|
In: ISSN: 2190-8567 ; Journal of Mathematical Neuroscience ; https://hal.archives-ouvertes.fr/hal-02531537 ; Journal of Mathematical Neuroscience, BioMed Central, 2021, 11 (1), pp.2. ⟨10.1186/s13408-020-00099-4⟩ (2021)
|
|
BASE
|
|
15 |
Photogrammétrie appliquée au végétale : automatisation et post traitement [Photogrammetry applied to plants: automation and post-processing]
|
|
|
|
In: 16èmes Journées de la Mesure et de la Métrologie (J2M) ; https://hal.archives-ouvertes.fr/hal-03644832 ; 16èmes Journées de la Mesure et de la Métrologie (J2M), INRAE, Oct 2021, Ardes sur Couze, France ; http://www7.inra.fr/j2m/fichiers/recueils/j2m_2021.pdf (2021)
|
|
BASE
|
|
16 |
La voce umana, dal respiro al canto [The human voice, from breath to song]
|
|
|
|
In: ISSN: 2611-5689 ; Bollettino del Laboratorio di Fonetica Sperimentale "Arturo Genre" ; https://hal.archives-ouvertes.fr/hal-03508030 ; Bollettino del Laboratorio di Fonetica Sperimentale "Arturo Genre", Universita di Torino, 2021, https://www.lfsag.unito.it/ricerca/phonews/07/7_3.pdf ; https://www.lfsag.unito.it/ricerca/phonews/index.html (2021)
|
|
BASE
|
|
17 |
Optimization of Dental Devices and Tools used on Teeth
|
|
|
|
In: BioMed Research International ; https://hal.archives-ouvertes.fr/hal-03253408 ; BioMed Research International, In press, pp.9913788. ⟨10.1155/2021/9913788⟩ (2021)
|
|
BASE
|
|
18 |
Learning emotions latent representation with CVAE for Text-Driven Expressive AudioVisual Speech Synthesis
|
|
|
|
In: ISSN: 0893-6080 ; Neural Networks ; https://hal.inria.fr/hal-03204193 ; Neural Networks, Elsevier, 2021, 141, pp.315-329. ⟨10.1016/j.neunet.2021.04.021⟩ (2021)
|
|
BASE
|
|
19 |
User-friendly automatic transcription of low-resource languages: Plugging ESPnet into Elpis
|
|
|
|
In: ComputEL-4: Fourth Workshop on the Use of Computational Methods in the Study of Endangered Languages ; https://halshs.archives-ouvertes.fr/halshs-03030529 ; ComputEL-4: Fourth Workshop on the Use of Computational Methods in the Study of Endangered Languages, Mar 2021, Hawai‘i, United States (2021)
|
|
BASE
|
|
20 |
Humming beatboxing : the vocal orchestra within
|
|
|
|
In: MAVEBA 2021 - 12th International Workshop Models and Analysis of Vocal Emissions for Biomedical Applications ; https://hal.archives-ouvertes.fr/hal-03510719 ; MAVEBA 2021 - 12th International Workshop Models and Analysis of Vocal Emissions for Biomedical Applications, Universita Degli Studi Firenze, Dec 2021, Florence, Italy ; http://maveba.dinfo.unifi.it (2021)
|
|
BASE