
Search in the Catalogues and Directories

Hits 1 – 13 of 13

1
Leveraging Passage Retrieval with Generative Models for Open Domain Question Answering
In: EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-03463108 ; EACL 2021 - 16th Conference of the European Chapter of the Association for Computational Linguistics, Apr 2021, Kiev, Ukraine. pp.874-880, ⟨10.18653/v1/2021.eacl-main.74⟩ (2021)
BASE
2
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the Web ...
Read paper: https://www.aclanthology.org/2021.acl-long.507
Abstract: We show that margin-based bitext mining in a multilingual sentence space can be successfully scaled to operate on monolingual corpora of billions of sentences. We use 32 snapshots of a curated common crawl corpus (Wenzek et al., 2019) totaling 71 billion unique sentences. Using one unified approach for 90 languages, we were able to mine 10.8 billion parallel sentences, out of which only 2.9 billion are aligned with English. We illustrate the capability of our scalable mining system to create high quality training sets from one language to any other by training hundreds of different machine translation models and evaluating them on the many-to-many TED benchmark. Further, we evaluate on competitive translation benchmarks such as WMT and WAT. Using only mined bitext, we set a new state of the art for a single system on the WMT'19 test set for English-German/Russian/Chinese. In particular, our English/German and English/Russian systems ...
URL: https://dx.doi.org/10.48448/z2vp-9188
https://underline.io/lecture/25726-ccmatrix-mining-billions-of-high-quality-parallel-sentences-on-the-web
BASE
3
Beyond English-Centric Multilingual Machine Translation ...
BASE
4
Unsupervised Cross-lingual Representation Learning at Scale ...
BASE
5
CCMatrix: Mining Billions of High-Quality Parallel Sentences on the WEB ...
BASE
6
Don't Forget the Long Tail! A Comprehensive Analysis of Morphological Generalization in Bilingual Lexicon Induction ...
BASE
7
Colorless green recurrent networks dream hierarchically
In: Proceedings of the Society for Computation in Linguistics (2019)
BASE
8
Unsupervised Hyperalignment for Multilingual Word Embeddings ...
BASE
9
Loss in Translation: Learning Bilingual Word Mapping with a Retrieval Criterion ...
BASE
10
Unsupervised Alignment of Embeddings with Wasserstein Procrustes ...
BASE
11
Colorless green recurrent networks dream hierarchically ...
BASE
12
Colorless green recurrent networks dream hierarchically
In: Proceedings of the 2018 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, NAACL-HLT 2018 P. 1195–1205 (2018)
BASE
13
A Markovian approach to distributional semantics with application to semantic compositionality
In: International Conference on Computational Linguistics (Coling) ; https://hal.inria.fr/hal-01080309 ; International Conference on Computational Linguistics (Coling), International Committee on Computational Linguistics (ICCL), Aug 2014, Dublin, Ireland. pp.1447 - 1456 ; http://www.coling-2014.org/ (2014)
BASE

Hits by source type
Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 13