
Search in the Catalogues and Directories

Hits 1 – 20 of 22

1
IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages ...
Abstract: Reliable evaluation benchmarks designed for replicability and comprehensiveness have driven progress in machine learning. Due to the lack of a multilingual benchmark, however, vision-and-language research has mostly focused on English language tasks. To fill this gap, we introduce the Image-Grounded Language Understanding Evaluation benchmark. IGLUE brings together - by both aggregating pre-existing datasets and creating new ones - visual question answering, cross-modal retrieval, grounded reasoning, and grounded entailment tasks across 20 diverse languages. Our benchmark enables the evaluation of multilingual multimodal models for transfer learning, not only in a zero-shot setting, but also in newly defined few-shot learning setups. Based on the evaluation of the available state-of-the-art models, we find that translate-test transfer is superior to zero-shot transfer and that few-shot learning is hard to harness for many tasks. Moreover, downstream performance is partially explained by the amount of ...
Keyword: Computation and Language cs.CL; Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2201.11732
https://arxiv.org/abs/2201.11732
BASE
2
MDAPT: Multilingual Domain Adaptive Pretraining in a Single Model ...
BASE
3
Vision-and-Language or Vision-for-Language? On Cross-Modal Influence in Multimodal Transformers ...
BASE
4
Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs ...
BASE
5
Multimodal pretraining unmasked: A meta-analysis and a unified framework of vision-and-language berts ...
BASE
6
mDAPT: Multilingual Domain Adaptive Pretraining in a Single Model ...
BASE
7
Visually Grounded Reasoning across Languages and Cultures ...
BASE
8
Visually Grounded Reasoning across Languages and Cultures ...
BASE
9
Multimodal pretraining unmasked: A meta-analysis and a unified framework of vision-and-language berts
In: Transactions of the Association for Computational Linguistics, 9 (2021)
BASE
10
The Role of Syntactic Planning in Compositional Image Captioning ...
BASE
11
Visually Grounded Reasoning across Languages and Cultures ...
BASE
12
CompGuessWhat?!: A Multi-task Evaluation Framework for Grounded Language Learning ...
BASE
13
The Sensitivity of Language Models and Humans to Winograd Schema Perturbations ...
BASE
14
Bootstrapping Disjoint Datasets for Multilingual Multimodal Representation Learning ...
BASE
15
Cross-lingual Visual Verb Sense Disambiguation ...
BASE
16
Lessons learned in multilingual grounded language learning ...
BASE
17
Findings of the Second Shared Task on Multimodal Machine Translation and Multilingual Image Description ...
BASE
18
Cross-linguistic differences and similarities in image descriptions ...
BASE
19
Multi30K: Multilingual English-German Image Descriptions
Elliott, Desmond [author]; Frank, Stella [author]; Sima'an, Khalil [author]. - Aachen : Universitätsbibliothek der RWTH Aachen, 2016
DNB Subject Category Language
20
Multi30K: Multilingual English-German Image Descriptions ...
BASE
