1. Defending Your Voice: Adversarial Attack on Voice Conversion ...
2. FragmentVC: Any-to-Any Voice Conversion by End-to-End Extracting and Fusing Fine-Grained Voice Fragments With Attention ...
3. Towards Unsupervised Speech Recognition and Synthesis with Quantized Speech Representation Learning ...
4. From Semi-supervised to Almost-unsupervised Speech Recognition with Very-low Resource by Jointly Learning Phonetic Structures from Audio and Text Embeddings ...
5. Improved Speech Separation with Time-and-Frequency Cross-domain Joint Embedding and Clustering ...
6. Multi-target Voice Conversion without Parallel Data by Adversarially Learning Disentangled Audio Representations ...
7. Almost-unsupervised Speech Recognition with Close-to-zero Resource Based on Phonetic Structures Learned from Very Small Unpaired Speech and Text Data ...
8. Segmental Audio Word2Vec: Representing Utterances as Sequences of Vectors with Applications in Spoken Term Detection ...
9. Phonetic-and-Semantic Embedding of Spoken Words with Applications in Spoken Content Retrieval ...
10. Rhythm-Flexible Voice Conversion without Parallel Data Using Cycle-GAN over Phoneme Posteriorgram Sequences ...
11. Improved Audio Embeddings by Adjacency-Based Clustering with Applications in Spoken Term Detection ...
12. An Iterative Deep Learning Framework for Unsupervised Discovery of Speech Features and Linguistic Units with Applications on Spoken Term Detection ...
13. Audio Word2Vec: Unsupervised Learning of Audio Segment Representations using Sequence-to-sequence Autoencoder ...
14. A Multi-layered Acoustic Tokenizing Deep Neural Network (MAT-DNN) for Unsupervised Discovery of Linguistic Units and Generation of High Quality Features ...
15. Unsupervised Discovery of Linguistic Structure Including Two-level Acoustic Patterns Using Three Cascaded Stages of Iterative Optimization ...