Hits 1 – 20 of 154

1	A Bottleneck Auto-Encoder for F0 Transformations on Speech and Singing Voice
	Bous, Frederik; Roebel, Axel
	In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599085 ; Information, MDPI, 2022, 13 (3), pp.102. ⟨10.3390/info13030102⟩ (2022)
	BASE

2	Neural Vocoding for Singing and Speaking Voices with the Multi-Band Excited WaveNet
	Roebel, Axel; Bous, Frederik
	In: ISSN: 2078-2489 ; Information ; https://hal.archives-ouvertes.fr/hal-03599076 ; Information, MDPI, 2022, 13 (3), pp.103. ⟨10.3390/info13030103⟩ (2022)
	BASE

3	Learning and controlling the source-filter representation of speech with a variational autoencoder
	Sadok, Samir; Leglaive, Simon; Girin, Laurent...
	In: https://hal.archives-ouvertes.fr/hal-03650569 ; 2022 (2022)
	BASE

4	Analyzing the impact of speaker localization errors on speech separation for automatic speech recognition
	Sivasankaran, Sunit; Vincent, Emmanuel; Fohr, Dominique
	In: EUSIPCO 2020 - 28th European Signal Processing Conference ; https://hal.inria.fr/hal-02355669 ; EUSIPCO 2020 - 28th European Signal Processing Conference, Jan 2021, Amsterdam / Virtual, Netherlands. ⟨10.23919/Eusipco47968.2020.9287541⟩ ; https://eusipco2020.org/ (2021)
	BASE

5	High-resolution speaker counting in reverberant rooms using CRNN with Ambisonics features
	Grumiaux, Pierre-Amaury; Kitic, Srdan; Girin, Laurent...
	In: EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO) ; https://hal.archives-ouvertes.fr/hal-03537323 ; EUSIPCO 2020 - 28th European Signal Processing Conference (EUSIPCO), Jan 2021, Amsterdam, Netherlands. pp.71-75, ⟨10.23919/Eusipco47968.2020.9287637⟩ (2021)
	BASE

6	Automatic Speech Recognition systems errors for accident-prone sleepiness detection through voice
	Martin, Vincent; Rouas, Jean-Luc; Boyer, Florian...
	In: EUSIPCO 2021 ; https://hal.archives-ouvertes.fr/hal-03324033 ; EUSIPCO 2021, Aug 2021, Dublin (online), Ireland. ⟨10.23919/EUSIPCO54536.2021.9616299⟩ (2021)
	BASE

7	Automatic Speech Recognition systems errors for objective sleepiness detection through voice
	Martin, Vincent; Rouas, Jean-Luc; Boyer, Florian...
	In: Proceedings Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03328827 ; Interspeech 2021, Aug 2021, Brno (virtual), Czech Republic. pp.2476-2480, ⟨10.21437/Interspeech.2021-291⟩ (2021)
	BASE

8	Speaker Attentive Speech Emotion Recognition
	Le Moine, Clément; Obin, Nicolas; Roebel, Axel
	In: Proceedings of Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03554368 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.2866-2870, ⟨10.21437/interspeech.2021-573⟩ (2021)
	BASE

9	Prosodic Boundary Prediction Model for Vietnamese Text-To-Speech
	Trang, Nguyen Thi Thu; Ky, Nguyen; Rilliard, Albert...
	In: Proc. Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329116 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.3885-3889, ⟨10.21437/interspeech.2021-125⟩ (2021)
	BASE

10	Learning robust speech representation with an articulatory-regularized variational autoencoder
	Georges, Marc-Antoine; Girin, Laurent; Schwartz, Jean-Luc...
	In: Proceedings of Interspeech 2021 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association ; https://hal.archives-ouvertes.fr/hal-03373252 ; Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021)
	BASE

11	Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
	Polyak, Adam; Adi, Yossi; Copet, Jade...
	In: INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association ; https://hal.inria.fr/hal-03329245 ; INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021)
	BASE

12	Learning spectro-temporal representations of complex sounds with parameterized neural networks
	Riad, Rachid; Karadayi, Julien; Bachoud-Lévi, Anne-Catherine...
	In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.inria.fr/hal-03329261 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (1), pp.353-366. ⟨10.1121/10.0005482⟩ (2021)
	BASE

13	Large vocabulary automatic speech recognition: from hybrid to end-to-end approaches ; Reconnaissance automatique de la parole à large vocabulaire : des approches hybrides aux approches End-to-End
	Heba, Abdelwahab. - : HAL CCSD, 2021
	In: https://hal.archives-ouvertes.fr/tel-03269807 ; Sound [cs.SD]. Université Toulouse 3 Paul Sabatier, 2021. French (2021)
	BASE

14	Leveraging lyrics from audio for MIR ; Exploiter les paroles de chansons à partir de l'audio pour le MIR
	Vaglio, Andrea. - : HAL CCSD, 2021
	In: https://tel.archives-ouvertes.fr/tel-03558515 ; Signal and Image processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAT027⟩ (2021)
	Abstract: Lyrics provide a lot of information about music since they encapsulate much of the semantics of songs. Such information could help users navigate a large collection of songs and could be used to recommend new music to them. However, this information is often unavailable in textual form. To get around this problem, singing voice recognition systems could be used to obtain transcripts directly from the audio. These approaches are generally adapted from speech recognition. Speech transcription is a decades-old domain that has lately seen significant advances due to developments in machine learning techniques. When applied to the singing voice, however, these algorithms give poor results, and for a number of reasons lyrics transcription remains difficult. In this thesis, we investigate several scientifically and industrially difficult "Music Information Retrieval" problems by using lyrics information generated directly from audio. The emphasis is on making approaches as relevant in real-world settings as possible. This entails testing them on vast and diverse datasets and investigating their scalability. To do so, a large publicly available annotated lyrics dataset is used, and several state-of-the-art lyrics recognition algorithms are successfully adapted. We notably present, for the first time, a system that detects explicit content directly from audio. The first research on the creation of a multilingual lyrics-to-audio alignment system is also described. The lyrics-to-audio alignment task is further studied in two experiments quantifying the perception of audio and lyrics synchronization. A novel phonotactic method for language identification is also presented. Finally, we provide the first cover song detection algorithm that makes explicit use of lyrics information extracted from audio.
	Keyword: [INFO.INFO-MM]Computer Science [cs]/Multimedia [cs.MM]; [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [SPI.SIGNAL]Engineering Sciences [physics]/Signal and Image processing; Alignement des paroles et de l'audio; Cover song detection; Détection de contenu explicite; Détection de reprise; Explicit content detection; Identification du langage; Language identification; Lyrics-To-Audio alignment; Reconnaissance de la voix chantée; Singing voice recognition
	URL: https://tel.archives-ouvertes.fr/tel-03558515 https://tel.archives-ouvertes.fr/tel-03558515/file/99637_VAGLIO_2021_archivage.pdf https://tel.archives-ouvertes.fr/tel-03558515/document
	BASE

15	Prosodic Disambiguation Using Chironomic Stylization of Intonation with Native and Non-Native Speakers
	Xiao, Xiao; Audibert, Nicolas; Locqueville, Grégoire...
	In: Proceedings Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329111 ; Interspeech 2021, Aug 2021, Brno (virtual), Czech Republic. pp.516-520, ⟨10.21437/Interspeech.2021-182⟩ (2021)
	BASE

16	Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery
	Ondel, Lucas; Yusuf, Bolaji; Burget, Lukáš...
	In: https://hal.archives-ouvertes.fr/hal-03467205 ; 2021 (2021)
	BASE

17	Vocal drum sounds in human beatboxing: An acoustic and articulatory exploration using electromagnetic articulography
	Paroni, Annalisa; Henrich Bernardoni, Nathalie; Savariaux, Christophe...
	In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.univ-grenoble-alpes.fr/hal-03107358 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 149 (1), pp.191-206. ⟨10.1121/10.0002921⟩ ; https://asa.scitation.org/doi/full/10.1121/10.0002921 (2021)
	BASE

18	Perceptual equivalence of the Liljencrants-Fant and linear-filter glottal flow models
	Perrotin, Olivier; Feugère, Lionel; D'Alessandro, Christophe
	In: ISSN: 0001-4966 ; EISSN: 1520-8524 ; Journal of the Acoustical Society of America ; https://hal.archives-ouvertes.fr/hal-03322875 ; Journal of the Acoustical Society of America, Acoustical Society of America, 2021, 150 (2), pp.1273-1285. ⟨10.1121/10.0005879⟩ ; https://doi.org/10.1121/10.0005879 (2021)
	BASE

19	End-to-End Speech Emotion Recognition: Challenges of Real-Life Emergency Call Centers Data Recordings
	Deschamps-Berger, Théo; Lamel, Lori; Devillers, Laurence
	In: ISBN: 978-1-6654-0019-0 ; 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII) ; https://hal.archives-ouvertes.fr/hal-03405970 ; 2021 9th International Conference on Affective Computing and Intelligent Interaction (ACII), Sep 2021, Nara, Japan ; https://www.acii-conf.net/2021/ (2021)
	BASE

20	Automated audio captioning by fine-tuning BART with AudioSet tags
	Gontier, Félix; Serizel, Romain; Cerisara, Christophe
	In: DCASE 2021 - 6th Workshop on Detection and Classification of Acoustic Scenes and Events ; https://hal.inria.fr/hal-03522488 ; DCASE 2021 - 6th Workshop on Detection and Classification of Acoustic Scenes and Events, Nov 2021, Virtual, Spain (2021)
	BASE
