
Search in the Catalogues and Directories

Hits 1 – 20 of 154

1
A Bottleneck Auto-Encoder for F0 Transformations on Speech and Singing Voice
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599085 ; Information, MDPI, 2022, 13 (3), pp.102. ⟨10.3390/info13030102⟩ (2022)
BASE
2
Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet
In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599076 ; Information, MDPI, 2022, 13 (3), pp.103. ⟨10.3390/info13030103⟩ (2022)
BASE
3
Learning and controlling the source-filter representation of speech with a variational autoencoder
In: https://hal.archives-ouvertes.fr/hal-03650569 ; 2022 (2022)
BASE
4
Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition
In: EUSIPCO 2020 - 28th European Signal Processing Conference ; https://hal.inria.fr/hal-02355669 ; EUSIPCO 2020 - 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands. ⟨10.23919/Eusipco47968.2020.9287541⟩ ; https://eusipco2020.org/ (2021)
BASE
5
High-resolution speaker counting in reverberant rooms using CRNN with Ambisonics features
In: EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO) ; https://hal.archives-ouvertes.fr/hal-03537323 ; EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO), Jan 2021, Amsterdam, Netherlands. pp.71-75, ⟨10.23919/Eusipco47968.2020.9287637⟩ (2021)
BASE
6
Automatic Speech Recognition systems errors for accident-prone sleepiness detection through voice
In: EUSIPCO 2021 ; https://hal.archives-ouvertes.fr/hal-03324033 ; EUSIPCO 2021, Aug 2021, Dublin (online), Ireland. ⟨10.23919/EUSIPCO54536.2021.9616299⟩ (2021)
BASE
7
Automatic Speech Recognition systems errors for objective sleepiness detection through voice
In: Proceedings Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03328827 ; Interspeech 2021, Aug 2021, Brno (virtual), Czech Republic. pp.2476-2480, ⟨10.21437/Interspeech.2021-291⟩ (2021)
BASE
8
Speaker Attentive Speech Emotion Recognition
In: Proceedings of Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03554368 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.2866-2870, ⟨10.21437/interspeech.2021-573⟩ (2021)
BASE
9
Prosodic Boundary Prediction Model for Vietnamese Text-To-Speech
In: Proc. Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329116 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.3885-3889, ⟨10.21437/interspeech.2021-125⟩ (2021)
BASE
10
Learning robust speech representation with an articulatory-regularized variational autoencoder
In: Proceedings of Interspeech 2021 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-03373252 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021)
BASE
11
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
In: INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association ; https://hal.inria.fr/hal-03329245 ; INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021)
BASE
12
Learning spectro-temporal representations of complex sounds with parameterized neural networks
In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.inria.fr/hal-03329261 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (1), pp.353-366. ⟨10.1121/10.0005482⟩ (2021)
BASE
13
Large vocabulary automatic speech recognition: from hybrid to end-to-end approaches ; Reconnaissance automatique de la parole à large vocabulaire : des approches hybrides aux approches End-to-End
Heba, Abdelwahab. - : HAL CCSD, 2021
In: https://hal.archives-ouvertes.fr/tel-03269807 ; Sound [cs.SD]. Université Toulouse 3 Paul Sabatier, 2021. French (2021)
BASE
14
Leveraging lyrics from audio for MIR ; Exploiter les paroles de chansons à partir de l'audio pour le MIR
Vaglio, Andrea. - : HAL CCSD, 2021
In: https://tel.archives-ouvertes.fr/tel-03558515 ; Signal and Image processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAT027⟩ (2021)
Abstract: Lyrics provide a lot of information about music since they encapsulate much of the semantics of songs. Such information could help users navigate easily through a large collection of songs and could be used to recommend new music to them. However, this information is often unavailable in its textual form. To get around this problem, singing voice recognition systems could be used to obtain transcripts directly from the audio. These approaches are generally adapted from speech recognition. Speech transcription is a decades-old domain that has lately seen significant advances due to developments in machine learning techniques. When applied to the singing voice, however, these algorithms give poor results, and lyrics transcription remains difficult for a number of reasons. In this thesis, we investigate several scientifically and industrially difficult 'Music Information Retrieval' problems by using lyrics information generated directly from audio. The emphasis is on making approaches as relevant in real-world settings as possible. This entails testing them on vast and diverse datasets and investigating their scalability. To do so, a large publicly available annotated lyrics dataset is used, and several state-of-the-art lyrics recognition algorithms are successfully adapted. We notably present, for the first time, a system that detects explicit content directly from audio. The first research on the creation of a multilingual lyrics-to-audio alignment system is also described. The lyrics-to-audio alignment task is further studied in two experiments quantifying the perception of audio and lyrics synchronization. A novel phonotactic method for language identification is also presented. Finally, we provide the first cover song detection algorithm that makes explicit use of lyrics information extracted from audio.
; Les paroles de chansons fournissent un grand nombre d’informations sur la musique car elles contiennent une grande partie de la sémantique des chansons. Ces informations pourraient aider les utilisateurs à naviguer facilement dans une large collection de chansons et permettre de leur offrir des recommandations personnalisées. Cependant, ces informations ne sont souvent pas disponibles sous leur forme textuelle. Les systèmes de reconnaissance de la voix chantée pourraient être utilisés pour obtenir des transcriptions directement à partir de la source audio. Ces approches sont usuellement adaptées de celles de la reconnaissance vocale. La transcription de la parole est un domaine vieux de plusieurs décennies qui a récemment connu des avancées significatives en raison des derniers développements des techniques d’apprentissage automatique. Cependant, appliqués au chant, ces algorithmes donnent des résultats peu satisfaisants et le processus de transcription des paroles reste difficile avec des complications particulières. Dans cette thèse, nous étudions plusieurs problèmes de ’Music Information Retrieval’ scientifiquement et industriellement complexes en utilisant des informations sur les paroles générées directement à partir de l’audio. L’accent est mis sur la nécessité de rendre les approches aussi pertinentes que possible dans le monde réel. Cela implique par exemple de les tester sur des ensembles de données vastes et diversifiés et d’étudier leur extensibilité. À cette fin, nous utilisons un large ensemble de données publiques possédant des annotations vocales et adaptons avec succès plusieurs des algorithmes de reconnaissance de paroles les plus performants. Nous présentons notamment, pour la première fois, un système qui détecte le contenu explicite directement à partir de l’audio. Les premières recherches sur la création d’un système d’alignement paroles audio multilingue sont également décrites.
L’étude de la tâche alignement paroles-audio est complétée de deux expériences quantifiant la perception de la synchronisation de l’audio et des paroles. Une nouvelle approche phonotactique pour l’identification de la langue est également présentée. Enfin, nous proposons le premier algorithme de détection de versions employant explicitement les informations sur les paroles extraites de l’audio.
Keyword: [INFO.INFO-MM]Computer Science [cs]/Multimedia [cs.MM]; [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; Alignement des paroles et de l'audio; Cover song detection; Détection de contenu explicite; Détection de reprise; Explicit content detection; Identification du langage; Language identification; Lyrics-To-Audio alignment; Reconnaissance de la voix chantée; Singing voice recognition
URL: https://tel.archives-ouvertes.fr/tel-03558515
https://tel.archives-ouvertes.fr/tel-03558515/file/99637_VAGLIO_2021_archivage.pdf
https://tel.archives-ouvertes.fr/tel-03558515/document
BASE
15
Prosodic Disambiguation Using Chironomic Stylization of Intonation with Native and Non-Native Speakers
In: Proceedings Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329111 ; Interspeech 2021, Aug 2021, Brno (virtual), Czech Republic. pp.516-520, ⟨10.21437/Interspeech.2021-182⟩ (2021)
BASE
16
Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery
In: https://hal.archives-ouvertes.fr/hal-03467205 ; 2021 (2021)
BASE
17
Vocal drum sounds in human beatboxing: An acoustic and articulatory exploration using electromagnetic articulography
In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.univ-grenoble-alpes.fr/hal-03107358 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 149 (1), pp.191-206. ⟨10.1121/10.0002921⟩ ; https://asa.scitation.org/doi/full/10.1121/10.0002921 (2021)
BASE
18
Perceptual equivalence of the Liljencrants-Fant and linear-filter glottal flow models
In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.archives-ouvertes.fr/hal-03322875 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (2), pp.1273-1285. ⟨10.1121/10.0005879⟩ ; https://doi.org/10.1121/10.0005879 (2021)
BASE
19
End-to-End Speech Emotion Recognition: Challenges of Real-Life Emergency Call Centers Data Recordings
In: ISBN: 978-1-6654-0019-0 ; 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII) ; https://hal.archives-ouvertes.fr/hal-03405970 ; 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII), Sep 2021, Nara, Japan ; https://www.acii-conf.net/2021/ (2021)
BASE
20
Automated audio captioning by fine-tuning BART with AudioSet tags
In: DCASE 2021 - 6th Workshop on Detection and Classification of Acoustic Scenes and Events ; https://hal.inria.fr/hal-03522488 ; DCASE 2021 - 6th Workshop on Detection and Classification of Acoustic Scenes and Events, Nov 2021, Virtual, Spain (2021)
BASE
