1 |
Differentially private speaker anonymization
|
|
|
|
In: https://hal.inria.fr/hal-03588932 ; 2022 (2022)
|
|
Abstract:
Sharing real-world speech utterances is key to the training and deployment of voice-based services. However, it also raises privacy risks as speech contains a wealth of personal data. Speaker anonymization aims to remove speaker information from a speech utterance while leaving its linguistic and prosodic attributes intact. State-of-the-art techniques operate by disentangling the speaker information (represented via a speaker embedding) from these attributes and re-synthesizing speech based on the speaker embedding of another speaker. Prior research in the privacy community has shown that anonymization often provides brittle privacy protection, let alone any provable guarantee. In this work, we show that disentanglement is indeed not perfect: linguistic and prosodic attributes still contain speaker information. We remove speaker information from these attributes by introducing differentially private feature extractors based on an autoencoder and an automatic speech recognizer, respectively, trained using noise layers. We plug these extractors into the state-of-the-art anonymization pipeline and generate, for the first time, differentially private utterances with a provable upper bound on the speaker information they contain. We empirically evaluate the privacy and utility resulting from our differentially private speaker anonymization approach on the LibriSpeech data set. Experimental results show that the generated utterances retain very high utility for automatic speech recognition training and inference, while being much better protected against strong adversaries who leverage full knowledge of the anonymization process to try to infer the speaker identity.
|
|
Keyword:
[INFO.INFO-TS] Computer Science [cs] / Signal and Image Processing
|
|
URL: https://hal.inria.fr/hal-03588932
|
|
BASE
|
|
|
|
2 |
Privacy and utility of x-vector based speaker anonymization
|
|
|
|
In: https://hal.inria.fr/hal-03197376 ; 2021 (2021)
|
|
BASE
|
|
|
|
3 |
Enhancing Speech Privacy with Slicing
|
|
|
|
In: https://hal.inria.fr/hal-03369137 ; 2021 (2021)
|
|
BASE
|
|
|
|
4 |
D-Cliques: Compensating for Data Heterogeneity with Topology in Decentralized Federated Learning
|
|
|
|
In: https://hal.inria.fr/hal-03498160 ; 2021 (2021)
|
|
BASE
|
|
|
|
6 |
Privacy Amplification by Decentralization
|
|
|
|
In: https://hal.inria.fr/hal-03100005 ; 2020 (2020)
|
|
BASE
|
|
|
|
7 |
Distributed Differentially Private Averaging with Improved Utility and Robustness to Malicious Parties
|
|
|
|
In: NeurIPS 2020 workshop on Privacy Preserving Machine Learning - PriML and PPML Joint Edition ; https://hal.archives-ouvertes.fr/hal-03117816 ; NeurIPS 2020 workshop on Privacy Preserving Machine Learning - PriML and PPML Joint Edition, Dec 2020, Vancouver (Virtual Workshop), Canada ; https://ppml-workshop.github.io/ (2020)
|
|
BASE
|
|
|
|
9 |
Evaluating Voice Conversion-based Privacy Protection against Informed Attackers
|
|
|
|
In: ICASSP 2020 - 45th International Conference on Acoustics, Speech, and Signal Processing ; https://hal.inria.fr/hal-02355115 ; ICASSP 2020 - 45th International Conference on Acoustics, Speech, and Signal Processing, IEEE Signal Processing Society, May 2020, Barcelona, Spain. pp.2802-2806 (2020)
|
|
BASE
|
|
|
|
10 |
Fully Decentralized Joint Learning of Personalized Models and Collaboration Graphs
|
|
|
|
In: AISTATS 2020 - The 23rd International Conference on Artificial Intelligence and Statistics ; https://hal.inria.fr/hal-03100057 ; AISTATS 2020 - The 23rd International Conference on Artificial Intelligence and Statistics, Aug 2020, Palermo / Virtual, Italy ; https://aistats.org/aistats2020/ (2020)
|
|
BASE
|
|
|
|
11 |
Échange de bruit corrélé pour le calcul distribué de moyenne avec garanties de confidentialité différentielle [Correlated noise exchange for distributed averaging with differential privacy guarantees]
|
|
|
|
In: Conférence sur l'Apprentissage Automatique 2020 ; https://hal.archives-ouvertes.fr/hal-03117907 ; Conférence sur l'Apprentissage Automatique 2020, Jun 2020, Vannes (Virtual), France ; https://cap-rfiap2020.sciencesconf.org/ (2020)
|
|
BASE
|
|
|
|
12 |
Distributed Differentially Private Averaging with Improved Utility and Robustness to Malicious Parties
|
|
|
|
In: https://hal.inria.fr/hal-03100019 ; 2020 (2020)
|
|
BASE
|
|
|
|
13 |
Who started this rumor? Quantifying the natural differential privacy guarantees of gossip protocols
|
|
|
|
In: DISC 2020 - 34th International Symposium on Distributed Computing ; https://hal.inria.fr/hal-02166432 ; DISC 2020 - 34th International Symposium on Distributed Computing, Oct 2020, Freiburg / Virtual, Germany (2020)
|
|
BASE
|
|
|
|
14 |
Private Protocols for U-Statistics in the Local Model and Beyond
|
|
|
|
In: AISTATS 2020 - 23rd International Conference on Artificial Intelligence and Statistics ; https://hal.inria.fr/hal-02310236 ; AISTATS 2020 - 23rd International Conference on Artificial Intelligence and Statistics, Aug 2020, Palermo, Italy (2020)
|
|
BASE
|
|
|
|
15 |
Design Choices for X-vector Based Speaker Anonymization
|
|
|
|
In: INTERSPEECH 2020 ; https://hal.archives-ouvertes.fr/hal-02610447 ; INTERSPEECH 2020, International Speech Communication Association (ISCA), Oct 2020, Shanghai, China (2020)
|
|
BASE
|
|
|
|
16 |
A comparative study of speech anonymization metrics
|
|
|
|
In: INTERSPEECH 2020 ; https://hal.inria.fr/hal-02907918 ; INTERSPEECH 2020, Oct 2020, Shanghai, China (2020)
|
|
BASE
|
|
|
|
17 |
Joint Learning of the Graph and the Data Representation for Graph-Based Semi-Supervised Learning ...
|
|
|
|
BASE
|
|
|
|
18 |
Privacy-Preserving Adversarial Representation Learning in ASR: Reality or Illusion?
|
|
|
|
In: INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association ; https://hal.inria.fr/hal-02166434 ; INTERSPEECH 2019 - 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria (2019)
|
|
BASE
|
|
|
|
19 |
A Probabilistic Model for Joint Learning of Word Embeddings from Texts and Images
|
|
|
|
In: Conference on Empirical Methods in Natural Language Processing (EMNLP 2018) ; https://hal.inria.fr/hal-01922985 ; Conference on Empirical Methods in Natural Language Processing (EMNLP 2018), 2018, Brussels, Belgium (2018)
|
|
BASE
|
|
|
|
|
|