1 |
Multilingual Unsupervised Sentence Simplification
In: https://hal.inria.fr/hal-03109299 (2021)
BASE
2 |
Text Generation with and without Retrieval ; Génération de textes basés sur la connaissance avec et sans recherche
In: https://hal.univ-lorraine.fr/tel-03542634 ; Computer Science [cs]. Université de Lorraine, 2021. English. ⟨NNT : 2021LORR0164⟩ (2021)
BASE
3 |
The FLORES-101 Evaluation Benchmark for Low-Resource and Multilingual Machine Translation ...
BASE
6 |
Findings of the AmericasNLP 2021 Shared Task on Open Machine Translation for Indigenous Languages of the Americas ...
BASE
7 |
Alternative Input Signals Ease Transfer in Multilingual Machine Translation ...
BASE
8 |
AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages ...
BASE
9 |
Multilingual AMR-to-Text Generation
In: 2020 Conference on Empirical Methods in Natural Language Processing, Nov 2020, Punta Cana, Dominican Republic ; https://hal.archives-ouvertes.fr/hal-02999676 (2020)
BASE
10 |
Augmenting Transformers with KNN-Based Composite Memory for Dialog
In: Transactions of the Association for Computational Linguistics, The MIT Press, In press, ⟨10.1162/tacl_a_00356⟩ ; EISSN: 2307-387X ; https://hal.archives-ouvertes.fr/hal-02999678 ; https://transacl.org/index.php/tacl (2020)
BASE
11 |
Multilingual Translation with Extensible Multilingual Pretraining and Finetuning ...
BASE
14 |
Facebook AI's WMT20 News Translation Task Submission ...
Abstract:
This paper describes Facebook AI's submission to the WMT20 shared news translation task. We focus on the low-resource setting and participate in two language pairs, Tamil<->English and Inuktitut<->English, where there is limited out-of-domain bitext and monolingual data. We approach the low-resource problem using two main strategies: leveraging all available data and adapting the system to the target news domain. We explore techniques that leverage bitext and monolingual data from all languages, such as self-supervised model pretraining, multilingual models, data augmentation, and reranking. To better adapt the translation system to the test domain, we explore dataset tagging and fine-tuning on in-domain data. We observe that different techniques provide varied improvements based on the available data of the language pair. Based on these findings, we integrate these techniques into one training pipeline. For En->Ta, we explore an unconstrained setup with additional Tamil bitext and monolingual data and show that ...
Keyword:
Computation and Language (cs.CL); Computer and information sciences (FOS)
URL: https://dx.doi.org/10.48550/arxiv.2011.08298 https://arxiv.org/abs/2011.08298
BASE
15 |
Beyond English-Centric Multilingual Machine Translation ...
BASE
16 |
MUSS: Multilingual Unsupervised Sentence Simplification by Mining Paraphrases ...
BASE