1 |
LeBenchmark: A Reproducible Framework for Assessing Self-Supervised Representation Learning from Speech
In: INTERSPEECH 2021: Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021) ; https://hal.archives-ouvertes.fr/hal-03317730
4 |
The contribution of visual articulatory gestures and orthography to speech processing: Evidence from novel word learning
In: Journal of Experimental Psychology: Learning, Memory, and Cognition, American Psychological Association, In press, ⟨10.1037/xlm0001036⟩ (2021) ; ISSN: 0278-7393 ; EISSN: 1939-1285 ; https://hal.archives-ouvertes.fr/hal-03189083
5 |
Learning robust speech representation with an articulatory-regularized variational autoencoder
In: Proceedings of Interspeech 2021 - 22nd Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021) ; https://hal.archives-ouvertes.fr/hal-03373252
6 |
Do Infants Really Learn Phonetic Categories?
In: Open Mind, MIT Press, 2021, 5, pp.113-131. ⟨10.1162/opmi_a_00046⟩ (2021) ; EISSN: 2470-2986 ; https://hal.archives-ouvertes.fr/hal-03550830
9 |
A multispeaker dataset of raw and reconstructed speech production real-time MRI video and 3D volumetric images ...
13 |
Models and activations for "Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? - A computational investigation" ...
15 |
Manual praxis and language-production networks: An fMRI dataset ...
16 |
Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? - A computational investigation ...
17 |
Le discours médiatique comme relation de pouvoir symbolique : pratiques de médiatisation de la diaspora [Media discourse as a relation of symbolic power: practices of mediatization of the diaspora]
In: Argumentum: Journal of the Seminar of Discursive Logic, Argumentation Theory and Rhetoric, Vol 19, Iss 1, Pp 45-65 (2021)
18 |
Towards unsupervised learning of speech features in the wild
In: SLT 2020 : IEEE Spoken Language Technology Workshop, Dec 2020, Shenzhen / Virtual, China (2020) ; https://hal.archives-ouvertes.fr/hal-03070411
19 |
Evaluating the reliability of acoustic speech embeddings
In: INTERSPEECH 2020 - Annual Conference of the International Speech Communication Association, Oct 2020, Shanghai / Virtual, China (2020) ; https://hal.inria.fr/hal-02977539
Abstract:
Speech embeddings are fixed-size acoustic representations of variable-length speech sequences. They are increasingly used for a variety of tasks ranging from information retrieval to unsupervised term discovery and speech segmentation. However, there is currently no clear methodology to compare or optimize the quality of these embeddings in a task-neutral way. Here, we systematically compare two popular metrics, ABX discrimination and Mean Average Precision (MAP), on 5 languages across 17 embedding methods, ranging from supervised to fully unsupervised, and using different loss functions (autoencoders, correspondence autoencoders, siamese). We then use ABX and MAP to predict performance on a new downstream task: the unsupervised estimation of the frequencies of speech segments in a given corpus. We find that, overall, ABX and MAP correlate with one another and with frequency estimation. However, substantial discrepancies appear in the fine-grained distinctions across languages and/or embedding methods. This makes it unrealistic at present to propose a task-independent silver bullet method for computing the intrinsic quality of speech embeddings. There is a need for more detailed analysis of the metrics currently used to evaluate such embeddings.
Keywords:
[INFO.INFO-CL] Computer Science [cs] / Computation and Language [cs.CL]; Evaluation metrics; Frequency estimation; k-nearest neighbours; Representation learning; Speech embeddings; Unsupervised speech processing
URL: https://hal.inria.fr/hal-02977539 https://hal.inria.fr/hal-02977539/file/Thu-3-2-6.pdf https://hal.inria.fr/hal-02977539/document
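Illustration: to make the two metrics named in the abstract above concrete, here is a minimal Python sketch. It is not the authors' evaluation code. It assumes cosine distance, a toy set of four two-dimensional embeddings, and generic textbook definitions: ABX error as the fraction of (A, B, X) triplets in which X, drawn from the same category as A, lies closer to B, and MAP as the mean over queries of retrieval average precision. The function names `abx_error` and `mean_average_precision` and the toy data are illustrative assumptions, and the paper's exact protocol may differ.

```python
import numpy as np


def cosine_distance(x, y):
    """Cosine distance between two vectors (assumed metric for this sketch)."""
    return 1.0 - float(np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y)))


def abx_error(embeddings, labels):
    """Fraction of (A, B, X) triplets, with X in A's category and B in another,
    for which X is closer to B than to A. Ties count as half an error."""
    n = len(labels)
    errors, total = 0.0, 0
    for a in range(n):
        for x in range(n):
            if a == x or labels[a] != labels[x]:
                continue
            d_ax = cosine_distance(embeddings[a], embeddings[x])
            for b in range(n):
                if labels[b] == labels[a]:
                    continue
                d_bx = cosine_distance(embeddings[b], embeddings[x])
                if d_ax > d_bx:
                    errors += 1.0
                elif d_ax == d_bx:
                    errors += 0.5
                total += 1
    return errors / total


def mean_average_precision(embeddings, labels):
    """Rank all other items by distance to each query, compute the average
    precision over same-label items, then average over queries."""
    n = len(labels)
    average_precisions = []
    for q in range(n):
        others = [i for i in range(n) if i != q]
        others.sort(key=lambda i: cosine_distance(embeddings[q], embeddings[i]))
        hits, precisions = 0, []
        for rank, i in enumerate(others, start=1):
            if labels[i] == labels[q]:
                hits += 1
                precisions.append(hits / rank)
        if precisions:  # skip queries that have no same-label neighbour
            average_precisions.append(sum(precisions) / len(precisions))
    return float(np.mean(average_precisions))


# Hypothetical toy data: two well-separated categories.
toy_embeddings = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]])
toy_labels = ["a", "a", "b", "b"]

print("ABX error:", abx_error(toy_embeddings, toy_labels))
print("MAP:", mean_average_precision(toy_embeddings, toy_labels))
```

On this toy data the two categories are cleanly separated, so both metrics agree; the abstract's point is that on real embeddings their fine-grained rankings can diverge.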
20 |
Data Augmenting Contrastive Learning of Speech Representations in the Time Domain
In: SLT 2020 - IEEE Spoken Language Technology Workshop, Dec 2020, Shenzhen / Virtual, China (2020) ; https://hal.archives-ouvertes.fr/hal-03070321