
Search in the Catalogues and Directories

Hits 1 – 11 of 11

1
Magic Dust for Cross-Lingual Adaptation of Monolingual wav2vec-2.0
In: ICASSP 2022, May 2022, Singapore; https://hal.archives-ouvertes.fr/hal-03544515 (2022)
BASE
2
Magic dust for cross-lingual adaptation of monolingual wav2vec-2.0 ...
BASE
3
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning
In: Interspeech 2020, Oct 2020, Shanghai, China; https://hal.archives-ouvertes.fr/hal-02912029 (2020)
Abstract: Probabilistic Latent Variable Models (LVMs) provide an alternative to self-supervised learning approaches for linguistic representation learning from speech. LVMs admit an intuitive probabilistic interpretation where the latent structure shapes the information extracted from the signal. Even though LVMs have recently seen renewed interest due to the introduction of Variational Autoencoders (VAEs), their use for speech representation learning remains largely unexplored. In this work, we propose the Convolutional Deep Markov Model (ConvDMM), a Gaussian state-space model with non-linear emission and transition functions modelled by deep neural networks. This unsupervised model is trained using black-box variational inference. A deep convolutional neural network is used as an inference network for structured variational approximation. When trained on a large-scale speech dataset (LibriSpeech), ConvDMM produces features that significantly outperform multiple self-supervised feature extraction methods on linear phone classification and recognition on the Wall Street Journal dataset. Furthermore, we found that ConvDMM complements self-supervised methods like Wav2Vec and PASE, improving on the results achieved with any of the methods alone. Lastly, we find that ConvDMM features enable learning better phone recognizers than any other features in an extreme low-resource regime with few labelled training examples. (See the illustrative sketch after the hit list below.)
Keyword: [INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-NE]Computer Science [cs]/Neural and Evolutionary Computing [cs.NE]; Neural Variational Latent Variable Model; Structured Variational Inference; Unsupervised Speech Representation Learning
URL: https://hal.archives-ouvertes.fr/hal-02912029/file/convDMM_arxiv.pdf
https://hal.archives-ouvertes.fr/hal-02912029/document
https://hal.archives-ouvertes.fr/hal-02912029
BASE
4
CSTNet: Contrastive Speech Translation Network for Self-Supervised Speech Representation Learning ...
BASE
5
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning ...
BASE
6
DARTS: Dialectal Arabic Transcription System ...
BASE
7
The Summa Platform Prototype ...
BASE
8
The Summa Platform Prototype ...
BASE
9
The SUMMA Platform Prototype
In: http://infoscience.epfl.ch/record/233575 (2017)
BASE
10
Multi-view Dimensionality Reduction for Dialect Identification of Arabic Broadcast Speech ...
BASE
11
Automatic Dialect Detection in Arabic Broadcast Speech ...
BASE
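
The abstract of hit 3 describes ConvDMM as a Gaussian state-space model whose emission and transition distributions are parameterised by neural networks and which is trained with black-box variational inference using a convolutional inference network. The sketch below illustrates that general recipe only; it is not the authors' implementation, and the module names, layer sizes, diagonal-Gaussian assumptions, and single-sample ELBO are all choices made here for illustration (PyTorch assumed).

```python
# Minimal sketch of a ConvDMM-style Gaussian state-space model.
# Everything below (names, sizes, single-sample ELBO) is an illustrative
# assumption, not the published ConvDMM code.
import torch
import torch.nn as nn

class ConvDMMSketch(nn.Module):
    def __init__(self, feat_dim=80, latent_dim=32, hidden_dim=256):
        super().__init__()
        # Convolutional inference network: acoustic features -> per-frame
        # parameters of the variational posterior q(z_t | x).
        self.encoder = nn.Sequential(
            nn.Conv1d(feat_dim, hidden_dim, kernel_size=5, padding=2),
            nn.ReLU(),
            nn.Conv1d(hidden_dim, 2 * latent_dim, kernel_size=5, padding=2),
        )
        # Non-linear transition: parameters of p(z_t | z_{t-1}) from an MLP.
        self.transition = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, 2 * latent_dim),
        )
        # Non-linear emission: mean of a Gaussian p(x_t | z_t) over features.
        self.emission = nn.Sequential(
            nn.Linear(latent_dim, hidden_dim), nn.ReLU(),
            nn.Linear(hidden_dim, feat_dim),
        )

    def forward(self, x):
        # x: (batch, time, feat_dim) acoustic features, e.g. filterbanks.
        q_params = self.encoder(x.transpose(1, 2)).transpose(1, 2)
        q_mu, q_logvar = q_params.chunk(2, dim=-1)
        # Reparameterised sample from q(z_t | x).
        z = q_mu + torch.randn_like(q_mu) * (0.5 * q_logvar).exp()

        # Prior p(z_t | z_{t-1}) from the transition network; a zero vector
        # stands in for z_0 at the first frame.
        prev = torch.cat([torch.zeros_like(z[:, :1]), z[:, :-1]], dim=1)
        p_mu, p_logvar = self.transition(prev).chunk(2, dim=-1)

        # Gaussian reconstruction term (unit-variance emission, up to constants).
        recon = 0.5 * (x - self.emission(z)).pow(2).sum(dim=(1, 2))
        # KL(q(z_t | x) || p(z_t | z_{t-1})) between diagonal Gaussians.
        kl = 0.5 * (p_logvar - q_logvar
                    + (q_logvar.exp() + (q_mu - p_mu).pow(2)) / p_logvar.exp()
                    - 1.0).sum(dim=(1, 2))
        neg_elbo = (recon + kl).mean()
        return neg_elbo, z  # z serves as the frame-level representation
```

In this reading of the abstract, training would minimise neg_elbo over batches of speech features, and the inferred latents z would then be evaluated the way the paper describes, e.g. by fitting a linear phone classifier on top of them.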

Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 11