1 |
Speaker information modification in the VoicePrivacy 2020 toolchain
In: https://hal.archives-ouvertes.fr/hal-02995855 ; [Research Report] INRIA Nancy, équipe Multispeech; LIUM - Laboratoire d'Informatique de l'Université du Mans. 2020 (2020)
|
|
Abstract:
This paper presents a study of the baseline system of the VoicePrivacy 2020 challenge. This baseline relies on a voice conversion system that aims to separate speaker identity from linguistic content for a given speech utterance. To generate an anonymized speech waveform, the neural acoustic model and the neural waveform model use the corresponding linguistic content together with a selected pseudo-speaker identity. The linguistic content is estimated using bottleneck features extracted from a triphone classifier, while the speaker information is extracted and then modified to target a pseudo-speaker identity in the x-vector space. In this work, we first propose replacing the triphone-based bottleneck feature extractor, which requires supervised training, with an end-to-end Automatic Speech Recognition (ASR) system. In this framework, we explore the use of adversarial and semi-adversarial training to learn linguistic features while masking speaker information. Finally, we explore several anonymization schemes to examine which module benefits the most from the generated pseudo-speaker identities.
|
|
Keywords:
[INFO.INFO-TS] Computer Science [cs]/Signal and Image Processing; Speaker anonymization; Speech recognition; VoicePrivacy 2020 Challenge
|
|
URL: https://hal.archives-ouvertes.fr/hal-02995855/document ; https://hal.archives-ouvertes.fr/hal-02995855/file/MultiSpeech.pdf ; https://hal.archives-ouvertes.fr/hal-02995855
|
|
BASE
|
|
2 |
About vocabulary adaptation for automatic speech recognition of video data
In: ICNLSSP'2017 - International Conference on Natural Language, Signal and Speech Processing ; https://hal.inria.fr/hal-01649057 ; ICNLSSP'2017 - International Conference on Natural Language, Signal and Speech Processing, Dec 2017, Casablanca, Morocco. pp.1-5 (2017)
|
|
BASE
|
|
3 |
Development of the Arabic Loria Automatic Speech Recognition system (ALASR) and its evaluation for Algerian dialect
In: ACLing 2017 - 3rd International Conference on Arabic Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-01583842 ; ACLing 2017 - 3rd International Conference on Arabic Computational Linguistics, Nov 2017, Dubai, United Arab Emirates. pp.1-8 (2017)
|
|
BASE
|
|
4 |
Reconnaissance de la parole pour l’aide à la communication pour les sourds et malentendants ; Speech recognition as a communication aid for deaf and hearing impaired people
BASE
|
|
5 |
Explicit trajectories and speaker class modeling for child and adult speech recognition ; Modélisation de trajectoires et de classes de locuteurs pour la reconnaissance de voix d'enfants et d'adultes
In: XXXème édition des Journées d'Etudes sur la Parole ; https://hal.inria.fr/hal-01080343 ; XXXème édition des Journées d'Etudes sur la Parole, Jun 2014, Le Mans, France (2014)
|
|
BASE