Hits 1 – 5 of 5

1	On Generative Spoken Language Modeling from Raw Audio
	Lakhotia, Kushal; Kharitonov, Evgeny; Hsu, Wei-Ning; Adi, Yossi; Polyak, Adam; Bolte, Benjamin; Nguyen, Tu-Anh; Copet, Jade; Baevski, Alexei; Mohamed, Abdelrahman; Dupoux, Emmanuel
	In: Transactions of the Association for Computational Linguistics, The MIT Press, 2021 ; EISSN: 2307-387X ; https://hal.inria.fr/hal-03329219
	Abstract: We introduce Generative Spoken Language Modeling, the task of learning the acoustic and linguistic characteristics of a language from raw audio (no text, no labels), and a set of metrics to automatically evaluate the learned representations at acoustic and linguistic levels for both encoding and generation. We set up baseline systems consisting of a discrete speech encoder (returning pseudo-text units), a generative language model (trained on pseudo-text), and a speech decoder (generating a waveform from pseudo-text), all trained without supervision, and validate the proposed metrics with human evaluation. Across 3 speech encoders (CPC, wav2vec 2.0, HuBERT), we find that the number of discrete units (50, 100, or 200) matters in a task-dependent and encoder-dependent way, and that some combinations approach text-based systems.
	Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
	URL: https://hal.inria.fr/hal-03329219/file/2102.01192.pdf https://hal.inria.fr/hal-03329219 https://hal.inria.fr/hal-03329219/document
	BASE

2	Towards Interactive Language Modeling ...
	ter Hoeve, Maartje; Kharitonov, Evgeny; Hupkes, Dieuwke. - : arXiv, 2021
	BASE

3	Generative Spoken Language Modeling from Raw Audio ...
	Lakhotia, Kushal; Kharitonov, Evgeny; Hsu, Wei-Ning. - : arXiv, 2021
	BASE

4	The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling
	Nguyen, Tu Anh; de Seyssel, Maureen; Rozé, Patricia...
	In: NeurIPS Workshop on Self-Supervised Learning for Speech and Audio Processing, Dec 2020, Virtual, France (2020) ; https://hal.archives-ouvertes.fr/hal-03070362
	BASE

5	The Zero Resource Speech Benchmark 2021: Metrics and baselines for unsupervised spoken language modeling ...
	Nguyen, Tu Anh; de Seyssel, Maureen; Rozé, Patricia. - : arXiv, 2020
	BASE