1 |
XTREME-S: Evaluating Cross-lingual Speech Representations ...
|
|
|
|
BASE
|
|
2 |
mSLAM: Massively multilingual joint pre-training for speech and text ...
|
|
Bapna, Ankur; Cherry, Colin; Zhang, Yu; Jia, Ye; Johnson, Melvin; Cheng, Yong; Khanuja, Simran; Riesa, Jason; Conneau, Alexis. arXiv, 2022
|
|
Abstract:
We present mSLAM, a multilingual Speech and LAnguage Model that learns cross-lingual cross-modal representations of speech and text by pre-training jointly on large amounts of unlabeled speech and text in multiple languages. mSLAM combines w2v-BERT pre-training on speech with SpanBERT pre-training on character-level text, along with Connectionist Temporal Classification (CTC) losses on paired speech and transcript data, to learn a single model capable of learning from and representing both speech and text signals in a shared representation space. We evaluate mSLAM on several downstream speech understanding tasks and find that joint pre-training with text improves quality on speech translation, speech intent classification and speech language-ID while being competitive on multilingual ASR, when compared against speech-only pre-training. Our speech translation model demonstrates zero-shot text translation without seeing any text translation data, providing evidence for cross-modal alignment of representations. ...
|
|
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences; Machine Learning cs.LG
|
|
URL: https://arxiv.org/abs/2202.01374 https://dx.doi.org/10.48550/arxiv.2202.01374
|
|
BASE
|
|
3 |
Larger-Scale Transformers for Multilingual Masked Language Modeling ...
|
|
|
|
BASE
|
|
4 |
Multilingual Speech Translation from Efficient Finetuning of Pretrained Models ...
|
|
|
|
BASE
|
|
5 |
Unsupervised Cross-lingual Representation Learning for Speech Recognition ...
|
|
|
|
BASE
|
|
6 |
Multilingual Speech Translation with Efficient Finetuning of Pretrained Models ...
|
|
|
|
BASE
|
|
7 |
Unsupervised Cross-lingual Representation Learning at Scale ...
|
|
|
|
BASE
|
|
8 |
Emerging Cross-lingual Structure in Pretrained Language Models ...
|
|
|
|
BASE
|
|
9 |
Specializing distributional vectors of all words for lexical entailment
|
|
|
|
BASE
|
|
10 |
What you can cram into a single $&!#* vector: Probing sentence embeddings for linguistic properties
|
|
|
|
In: ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-01898412 ; ACL 2018 - 56th Annual Meeting of the Association for Computational Linguistics, Jul 2018, Melbourne, Australia. pp.2126-2136 (2018)
|
|
BASE
|
|
11 |
XNLI: Evaluating Cross-lingual Sentence Representations ...
|
|
|
|
BASE
|
|
12 |
What you can cram into a single vector: Probing sentence embeddings for linguistic properties ...
|
|
|
|
BASE
|
|
13 |
Very Deep Convolutional Networks for Text Classification
|
|
|
|
In: European Chapter of the Association for Computational Linguistics EACL'17 ; https://hal.archives-ouvertes.fr/hal-01454940 ; European Chapter of the Association for Computational Linguistics EACL'17, 2017, Valencia, Spain (2017)
|
|
BASE
|
|
15 |
What you can cram into a single $&!#* vector: probing sentence embeddings for linguistic properties
|
|
|
|
BASE
|
|
|
|