DE eng

Search in the Catalogues and Directories

Hits 1 – 4 of 4

1
LIBRI-LIGHT: a benchmark for asr with limited or no supervision
In: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing ; https://hal.archives-ouvertes.fr/hal-02959460 ; ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, May 2020, Barcelona / Virtual, Spain. pp.7669-7673, ⟨10.1109/ICASSP40776.2020.9052942⟩ (2020)
BASE
Show details
2
Data Augmenting Contrastive Learning of Speech Representations in the Time Domain
In: SLT 2020 - IEEE Spoken Language Technology Workshop ; https://hal.archives-ouvertes.fr/hal-03070321 ; SLT 2020 - IEEE Spoken Language Technology Workshop, Dec 2020, Shenzhen / Virtual, China (2020)
Abstract: International audience ; Contrastive Predictive Coding (CPC), based on predicting future segments of speech based on past segments is emerging as a powerful algorithm for representation learning of speech signal. However, it still under-performs other methods on unsupervised evaluation benchmarks. Here, we introduce WavAugment, a time-domain data augmentation library and find that applying augmentation in the past is generally more efficient and yields better performances than other methods. We find that a combination of pitch modification, additive noise and reverberation substantially increase the performance of CPC (relative improvement of 18-22%), beating the reference Libri-light results with 600 times less data. Using an out-of-domain dataset, time-domain data augmentation can push CPC to be on par with the state of the art on the Zero Speech Benchmark 2017. We also show that time-domain data augmentation consistently improves downstream limited-supervision phoneme classification tasks by a factor of 12-15% relative.
Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; Contrastive predictive coding; Data augmentation; Speech recognition; Unsupervised representation learning
URL: https://hal.archives-ouvertes.fr/hal-03070321
https://hal.archives-ouvertes.fr/hal-03070321/file/2007.00991.pdf
https://hal.archives-ouvertes.fr/hal-03070321/document
BASE
Hide details
3
Unsupervised pretraining transfers well across languages
In: ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing ; https://hal.archives-ouvertes.fr/hal-02959418 ; ICASSP 2020 - 2020 IEEE International Conference on Acoustics, Speech and Signal Processing, May 2020, Barcelona / Virtual, Spain. pp.7414-7418, ⟨10.1109/ICASSP40776.2020.9054548⟩ (2020)
BASE
Show details
4
Unsupervised pretraining transfers well across languages ...
BASE
Show details

Catalogues
0
0
0
0
0
0
0
Bibliographies
0
0
0
0
0
0
0
0
0
Linked Open Data catalogues
0
Online resources
0
0
0
0
Open access documents
4
0
0
0
0
© 2013 - 2024 Lin|gu|is|tik | Imprint | Privacy Policy | Datenschutzeinstellungen ändern