1 |
FST: the FAIR Speech Translation System for the IWSLT21 Multilingual Shared Task ...
|
|
|
|
Abstract:
In this paper, we describe our end-to-end multilingual speech translation system submitted to the IWSLT 2021 evaluation campaign on the Multilingual Speech Translation shared task. Our system is built by leveraging transfer learning across modalities, tasks and languages. First, we leverage general-purpose multilingual modules pretrained with large amounts of unlabelled and labelled data. We further enable knowledge transfer from the text task to the speech task by training two tasks jointly. Finally, our multilingual model is finetuned on speech translation task-specific data to achieve the best translation results. Experimental results show our system outperforms the reported systems, including both end-to-end and cascade-based approaches, by a large margin. In some translation directions, our speech translation results evaluated on the public Multilingual TEDx test set are even comparable with the ones from a strong text-to-text translation system, which uses the oracle speech transcripts as input. ... : Accepted by IWSLT 2021 as a system paper ...
|
|
Keyword:
Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
|
|
URL: https://arxiv.org/abs/2107.06959 https://dx.doi.org/10.48550/arxiv.2107.06959
|
|
BASE
|
|
|
|
2 |
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the Web ...
|
|
|
|
BASE
|
|
|
|
5 |
WikiMatrix: Mining 135M Parallel Sentences in 1620 Language Pairs from Wikipedia ...
|
|
|
|
BASE
|
|
|
|
6 |
MLQA: Evaluating Cross-lingual Extractive Question Answering ...
|
|
|
|
BASE
|
|
|
|
7 |
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB ...
|
|
|
|
BASE
|
|
|
|
8 |
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond
|
|
|
|
In: Transactions of the Association for Computational Linguistics, Vol 7, Pp 597-610 (2019)
|
|
BASE
|
|
|
|
9 |
Massively Multilingual Sentence Embeddings for Zero-Shot Cross-Lingual Transfer and Beyond ...
|
|
|
|
BASE
|
|
|
|
10 |
XNLI: Evaluating Cross-lingual Sentence Representations ...
|
|
|
|
BASE
|
|
|
|
11 |
Filtering and Mining Parallel Data in a Joint Multilingual Space ...
|
|
|
|
BASE
|
|
|
|
12 |
A Corpus for Multilingual Document Classification in Eight Languages ...
|
|
|
|
BASE
|
|
|
|
13 |
Very Deep Convolutional Networks for Text Classification
|
|
|
|
In: European Chapter of the Association for Computational Linguistics EACL'17, 2017, Valencia, Spain ; https://hal.archives-ouvertes.fr/hal-01454940 (2017)
|
|
BASE
|
|
|
|
14 |
Learning Joint Multilingual Sentence Representations with Neural Machine Translation ...
|
|
|
|
BASE
|
|
|
|
15 |
OCR Error Correction Using Statistical Machine Translation
|
|
|
|
In: 16th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2015), 2015, Cairo, Egypt ; https://hal.archives-ouvertes.fr/hal-01433200 (2015)
|
|
BASE
|
|
|
|
16 |
Continuous Adaptation to User Feedback for Statistical Machine Translation
|
|
|
|
In: North American Chapter of the Association for Computational Linguistics – Human Language Technologies (NAACL HLT 2015), 2015, Denver (Colorado, USA) ; https://hal.archives-ouvertes.fr/hal-01454944 (2015)
|
|
BASE
|
|
|
|
17 |
Continuous adaptation to user feedback for statistical machine translation
|
|
|
|
In: 1001 ; 1005 (2015)
|
|
BASE
|
|
|
|
20 |
Translation project adaptation for MT-enhanced computer assisted translation
|
|
|
|
In: Machine Translation, Springer Verlag, 2014, 28, pp.127. ⟨10.1007/s10590-014-9152-1⟩ ; ISSN: 0922-6567 ; EISSN: 1573-0573 ; https://hal.archives-ouvertes.fr/hal-01157893 (2014)
|
|
BASE
|
|
|
|
|
|