
Search in the Catalogues and Directories

Hits 1 – 20 of 1,184

1
ViQuAE, a Dataset for Knowledge-based Visual Question Answering about Named Entities
In: ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22) ; https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618 ; 2022 (2022)
Abstract: Whether to retrieve, answer, translate, or reason, multimodality opens up new challenges and perspectives. In this context, we are interested in answering questions about named entities grounded in a visual context using a Knowledge Base (KB). To benchmark this task, called KVQAE (Knowledge-based Visual Question Answering about named Entities), we provide ViQuAE, a dataset of 3.7K questions paired with images. This is the first KVQAE dataset to cover a wide range of entity types (e.g. persons, landmarks, and products). The dataset is annotated using a semi-automatic method. We also propose a KB composed of 1.5M Wikipedia articles paired with images. To set a baseline on the benchmark, we address KVQAE as a two-stage problem: Information Retrieval and Reading Comprehension, with both zero- and few-shot learning methods. The experiments empirically demonstrate the difficulty of the task, especially when questions are not about persons. This work paves the way for better multimodal entity representations and question answering. The dataset, KB, code, and semi-automatic annotation pipeline are freely available at https://github.com/PaulLerner/ViQuAE.
Keyword: [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-MM]Computer Science [cs]/Multimedia [cs.MM]; dataset; knowledge-based visual question answering; multimodal
URL: https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618
https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618/document
https://doi.org/10.1145/3477495.3531753
https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618/file/lerner_sigir_2022_camera.pdf
BASE
2
Efficiency of Use of Internet Resources in Teaching a Foreign Language at Non-Linguistic Universities ...
Editor Academic Journals & Conferences. - : Open Science Framework, 2022
BASE
3
Unsupervised quantification of entity consistency between photos and text in real-world news ...
Müller-Budack, Eric. - : Hannover : Institutionelles Repositorium der Leibniz Universität Hannover, 2022
BASE
4
О РОЛИ ПРЕЗЕНТАЦИИ ПРИ ОБУЧЕНИИ ИНОСТРАННОМУ ЯЗЫКУ В СФЕРЕ ПРОФЕССИОНАЛЬНОЙ КОММУНИКАЦИИ ... : THE ROLE OF PRESENTATION IN TEACHING A FOREIGN LANGUAGE IN THE FIELD OF PROFESSIONAL COMMUNICATION ...
Куклина А.И.. - : The Scientific Heritage, 2022
BASE
5
Sign Language Recognition System using TensorFlow Object Detection API ...
BASE
6
hate-alert@DravidianLangTech-ACL2022: Ensembling Multi-Modalities for Tamil TrollMeme Classification ...
BASE
7
3MASSIV: Multilingual, Multimodal and Multi-Aspect dataset of Social Media Short Videos ...
BASE
8
Uni-EDEN: Universal Encoder-Decoder Network by Multi-Granular Vision-Language Pre-training ...
Li, Yehao; Fan, Jiahao; Pan, Yingwei. - : arXiv, 2022
BASE
9
Chain-based Discriminative Autoencoders for Speech Recognition ...
BASE
10
Multimedia Interventions for Neurodiversity: Leveraging Insights from Developmental Cognitive Neuroscience to Build an Innovative Practice
In: Brain Sciences; Volume 12; Issue 2; Pages: 147 (2022)
BASE
11
COVID-19 and cyberbullying: deep ensemble model to identify cyberbullying from code-switched languages during the pandemic
In: Multimed Tools Appl (2022)
BASE
12
FaceTuneGAN: Face Autoencoder for Convolutional Expression Transfer Using Neural Generative Adversarial Networks
In: https://hal.inria.fr/hal-03462778 ; 2021 (2021)
BASE
13
The L2L system for second language learning using visualised zoom calls among students
In: Dey-Plissonneau, Aparajita, Lee, Hyowon orcid:0000-0003-4395-7702 , Pradier, Vincent orcid:0000-0002-7050-6408 , Scriney, Michael orcid:0000-0001-6813-2630 and Smeaton, Alan F. orcid:0000-0003-1028-8389 (2021) The L2L system for second language learning using visualised zoom calls among students. In: 16th European Conference on Technology-Enhanced Learning EC-TEL 2021, 20-24 Sept 2021, Bozen-Bolzano, Italy (Online). ISBN 978-3-030-86435-4 (2021)
BASE
14
Utilising visual attention cues for vehicle detection and tracking
In: Hu, Feiyan orcid:0000-0001-7451-6438 , Gurram Munirathnam, Venkatesh orcid:0000-0002-4393-9267 , O'Connor, Noel E. orcid:0000-0002-4033-9135 , Smeaton, Alan F. orcid:0000-0003-1028-8389 and Little, Suzanne orcid:0000-0003-3281-3471 (2021) Utilising visual attention cues for vehicle detection and tracking. In: 25th International Conference on Pattern Recognition (ICPR2020), 10-15 Jan 2021, Milan, Italy (Online). (2021)
BASE
15
Attention based video summaries of live online zoom classes
In: Lee, Hyowon orcid:0000-0003-4395-7702 , Liu, Mingming orcid:0000-0002-8988-2104 , Riaz, Hamza, Rajasekaran, Navaneethan, Scriney, Michael orcid:0000-0001-6813-2630 and Smeaton, Alan F. orcid:0000-0003-1028-8389 (2021) Attention based video summaries of live online zoom classes. In: AAAI-2021 Workshop on AI Education: "Imagining Post-COVID Education with AI" (TIPCE-2021)., 9 Feb 2021, Online (Vancouver, Canada). (In Press) (2021)
BASE
16
Supporting an effective review of telecollaboration for second language learning by visualising the participation and engagement at Dublin City University
In: Lee, Hyowon orcid:0000-0003-4395-7702 , Scriney, Michael orcid:0000-0001-6813-2630 , Dey-Plissonneau, Aparajita and Smeaton, Alan orcid:0000-0003-1028-8389 (2021) Supporting an effective review of telecollaboration for second language learning by visualising the participation and engagement at Dublin City University. In: Virtual Exchange in Higher Education: Charting the Irish Experience, 17 Sept 2021, Online vs MS Teams. (2021)
BASE
17
Designer Minds: Examining Youths’ Multimodal Literacies
Lew, Lilly Chung. - : eScholarship, University of California, 2021
BASE
18
Leveraging lyrics from audio for MIR ; Exploiter les paroles de chansons à partir de l'audio pour le MIR
Vaglio, Andrea. - : HAL CCSD, 2021
In: https://tel.archives-ouvertes.fr/tel-03558515 ; Signal and Image processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAT027⟩ (2021)
BASE
19
Overview of LifeCLEF 2021: an evaluation of Machine-Learning based Species Identification and Species Distribution Prediction
In: Experimental IR Meets Multilinguality, Multimodality, and Interaction ; https://hal.inria.fr/hal-03415990 ; K. Selçuk Candan; Bogdan Ionescu; Lorraine Goeuriot; Birger Larsen; Henning Müller; Alexis Joly; Maria Maistro; Florina Piroi; Guglielmo Faggioli; Nicola Ferro. Experimental IR Meets Multilinguality, Multimodality, and Interaction, 12880, Springer International Publishing, pp.371-393, 2021, Lecture Notes in Computer Science, ⟨10.1007/978-3-030-85251-1_24⟩ (2021)
BASE
20
Simulating reading mistakes for child speech Transformer-based phone recognition
In: Annual Conference of the International Speech Communication Association (INTERSPEECH) ; https://hal.archives-ouvertes.fr/hal-03257870 ; Annual Conference of the International Speech Communication Association (INTERSPEECH), Aug 2021, Brno, Czech Republic (2021)
BASE
