1 |
ViQuAE, a Dataset for Knowledge-based Visual Question Answering about Named Entities
|
|
|
|
In: ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22) ; https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618 (2022)
|
|
Abstract:
Whether to retrieve, answer, translate, or reason, multimodality opens up new challenges and perspectives. In this context, we are interested in answering questions about named entities grounded in a visual context using a Knowledge Base (KB). To benchmark this task, called KVQAE (Knowledge-based Visual Question Answering about named Entities), we provide ViQuAE, a dataset of 3.7K questions paired with images. This is the first KVQAE dataset to cover a wide range of entity types (e.g., persons, landmarks, and products). The dataset is annotated using a semi-automatic method. We also propose a KB composed of 1.5M Wikipedia articles paired with images. To set a baseline on the benchmark, we address KVQAE as a two-stage problem: Information Retrieval and Reading Comprehension, with both zero- and few-shot learning methods. The experiments empirically demonstrate the difficulty of the task, especially when questions are not about persons. This work paves the way for better multimodal entity representations and question answering. The dataset, KB, code, and semi-automatic annotation pipeline are freely available at https://github.com/PaulLerner/ViQuAE.
|
|
Keyword:
[INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-MM]Computer Science [cs]/Multimedia [cs.MM]; dataset; knowledge-based visual question answering; multimodal
|
|
URL: https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618 https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618/document https://doi.org/10.1145/3477495.3531753 https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618/file/lerner_sigir_2022_camera.pdf
|
|
BASE
|
|
4 |
FedQAS: Privacy-Aware Machine Reading Comprehension with Federated Learning
|
|
|
|
In: Applied Sciences; Volume 12; Issue 6; Pages: 3130 (2022)
|
|
BASE
|
|
5 |
A Dynamic Attention and Multi-Strategy-Matching Neural Network Based on Bert for Chinese Rice-Related Answer Selection
|
|
|
|
In: Agriculture; Volume 12; Issue 2; Pages: 176 (2022)
|
|
BASE
|
|
6 |
Translate Wisely! An Evaluation of Close and Adaptive Translation Procedures in an Experiment Involving Questionnaire Translation
|
|
|
|
In: International journal of sociology ; 51 ; 2 ; 135-162 (2022)
|
|
BASE
|
|
7 |
Perspective de la grammaire générative sur l’anaphore
|
|
|
|
In: Corela, Vol 35 (2022)
|
|
BASE
|
|
8 |
Narrow scoping content question items in shifty contexts: A case of surprising non-quotation in Uyghur
|
|
|
|
In: Proceedings of the Linguistic Society of America; Vol 7, No 1; 5235 ; 2473-8689 (2022)
|
|
BASE
|
|
9 |
Quantifying semantic and pragmatic effects on scalar diversity
|
|
|
|
In: Proceedings of the Linguistic Society of America; Vol 7, No 1; 5216 ; 2473-8689 (2022)
|
|
BASE
|
|
10 |
Preservice teacher’s purposeful questioning: a descriptive case study of elementary mathematics preservice teachers
|
|
|
|
BASE
|
|
11 |
English machine reading comprehension: new approaches to answering multiple-choice questions
|
|
Dzendzik, Daria. - Dublin City University, School of Computing ; Dublin City University, ADAPT, 2021
|
|
In: PhD thesis, Dublin City University (2021)
|
|
BASE
|
|
12 |
Unsupervised Morphological Segmentation and Part-of-Speech Tagging for Low-Resource Scenarios
|
|
|
|
BASE
|
|
13 |
The information-structural status of adjuncts: A QUD-based approach
|
|
|
|
In: Discours ; https://halshs.archives-ouvertes.fr/halshs-03520607 ; Discours, 2021, 28 (2021)
|
|
BASE
|
|
14 |
The Role of the Auditory and Visual Modalities in the Perceptual Identification of Brazilian Portuguese Statements and Echo Questions
|
|
|
|
In: Language and Speech, SAGE Publications (UK and US), 2021, 64 (1), pp. 3-23 ; ISSN: 0023-8309 ; ⟨10.1177/0023830919898886⟩ ; https://hal.archives-ouvertes.fr/hal-02456308 ; https://journals.sagepub.com/doi/pdf/10.1177/0023830919898886 (2021)
|
|
BASE
|
|
15 |
A Multimodal Approach to the Discursive Construction of Stances in Political Debates in Hong Kong
|
|
|
|
BASE
|
|
16 |
Geographic Question Answering with Spatially-Explicit Machine Learning Models
|
|
|
|
BASE
|
|
17 |
Neural Question Answering Models with Broader Knowledge Scope and Deeper Reasoning Power
|
|
|
|
BASE
|
|
19 |
Exploiting multimodality and structure in world representations ...
|
|
|
|
BASE
|
|