
Search in the Catalogues and Directories

Hits 1 – 2 of 2

1  Delving Deeper into Cross-lingual Visual Question Answering ...
Source: BASE
2  IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages ...
Abstract: Reliable evaluation benchmarks designed for replicability and comprehensiveness have driven progress in machine learning. Due to the lack of a multilingual benchmark, however, vision-and-language research has mostly focused on English language tasks. To fill this gap, we introduce the Image-Grounded Language Understanding Evaluation benchmark. IGLUE brings together, by both aggregating pre-existing datasets and creating new ones, visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages. Our benchmark enables the evaluation of multilingual multimodal models for transfer learning, not only in a zero-shot setting, but also in newly defined few-shot learning setups. Based on the evaluation of the available state-of-the-art models, we find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks. Moreover, downstream performance is partially explained by the amount of ...
Keywords: Computation and Language (cs.CL); Computer Vision and Pattern Recognition (cs.CV); FOS: Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2201.11732
https://arxiv.org/abs/2201.11732
Source: BASE

Results by resource type:
Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 2
© 2013 – 2024 Lin|gu|is|tik