Hits 1 – 20 of 84

1	ViQuAE, a Dataset for Knowledge-based Visual Question Answering about Named Entities
	Lerner, Paul; Ferret, Olivier; Guinaudeau, Camille; Le Borgne, Hervé; Besançon, Romaric; Moreno, José G.; Lovón Melgarejo, Jesús
	In: ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR ’22) ; https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618 ; 2022 (2022)
	Abstract: Whether to retrieve, answer, translate, or reason, multimodality opens up new challenges and perspectives. In this context, we are interested in answering questions about named entities grounded in a visual context using a Knowledge Base (KB). To benchmark this task, called KVQAE (Knowledge-based Visual Question Answering about named Entities), we provide ViQuAE, a dataset of 3.7K questions paired with images. This is the first KVQAE dataset to cover a wide range of entity types (e.g. persons, landmarks, and products). The dataset is annotated using a semi-automatic method. We also propose a KB composed of 1.5M Wikipedia articles paired with images. To set a baseline on the benchmark, we address KVQAE as a two-stage problem: Information Retrieval and Reading Comprehension, with both zero- and few-shot learning methods. The experiments empirically demonstrate the difficulty of the task, especially when questions are not about persons. This work paves the way for better multimodal entity representations and question answering. The dataset, KB, code, and semi-automatic annotation pipeline are freely available at https://github.com/PaulLerner/ViQuAE.
	Keyword: [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-MM]Computer Science [cs]/Multimedia [cs.MM]; dataset; knowledge-based visual question answering; multimodal
	URL: https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618 https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618/document https://doi.org/10.1145/3477495.3531753 https://hal-universite-paris-saclay.archives-ouvertes.fr/hal-03650618/file/lerner_sigir_2022_camera.pdf
	BASE

2	Unsupervised quantification of entity consistency between photos and text in real-world news ...
	Müller-Budack, Eric. - : Hannover : Institutionelles Repositorium der Leibniz Universität Hannover, 2022
	BASE

3	Supporting an effective review of telecollaboration for second language learning by visualising the participation and engagement at Dublin City University
	Lee, Hyowon; Scriney, Michael; Dey-Plissonneau, Aparajita...
	In: Lee, Hyowon orcid:0000-0003-4395-7702 , Scriney, Michael orcid:0000-0001-6813-2630 , Dey-Plissonneau, Aparajita and Smeaton, Alan orcid:0000-0003-1028-8389 (2021) Supporting an effective review of telecollaboration for second language learning by visualising the participation and engagement at Dublin City University. In: Virtual Exchange in Higher Education: Charting the Irish Experience, 17 Sept 2021, Online vs MS Teams. (2021)
	BASE

4	Sign and Search: Sign Search Functionality for Sign Language Lexica ...
	Fragkiadakis, Manolis; van der Putten, Peter. - : arXiv, 2021
	BASE

5	Unsupervised Cross-Modal Audio Representation Learning from Unstructured Multilingual Text ...
	Schindler, Alexander; Gordea, Sergiu; Knees, Peter. - : arXiv, 2020
	BASE

6	Recommending Themes for Ad Creative Design via Visual-Linguistic Representations ...
	Zhou, Yichao; Mishra, Shaunak; Verma, Manisha. - : arXiv, 2020
	BASE

7	Fuzzy Logic Based Integration of Web Contextual Linguistic Structures for Enriching Conceptual Visual Representations ...
	Belkhatir, M.. - : arXiv, 2020
	BASE

8	MusicTM-Dataset for Joint Representation Learning among Sheet Music, Lyrics, and Musical Audio ...
	Zeng, Donghuo; Yu, Yi; Oyama, Keizo. - : arXiv, 2020
	BASE

9	Utilization of multimodal interaction signals for automatic summarisation of academic presentations
	Curtis, Keith. - : Dublin City University. School of Computing, 2018
	In: Curtis, Keith (2018) Utilization of multimodal interaction signals for automatic summarisation of academic presentations. PhD thesis, Dublin City University. (2018)
	BASE

10	Multimodal Machine Translation with Reinforcement Learning ...
	Qian, Xin; Zhong, Ziyi; Zhou, Jieli. - : arXiv, 2018
	BASE

11	ImproteK: introducing scenarios into human-computer music improvisation
	Nika, Jérôme; Chemillier, Marc; Assayag, Gérard
	In: ACM Computers in Entertainment ; https://hal.archives-ouvertes.fr/hal-01380163 ; ACM Computers in Entertainment, 2017, ⟨10.1145/3022635⟩ (2017)
	BASE

12	Multimodal Person Discovery in Broadcast TV: lessons learned from MediaEval 2015
	Poignant, Johann; Bredin, Hervé; Barras, Claude
	In: ISSN: 1380-7501 ; EISSN: 1573-7721 ; Multimedia Tools and Applications ; https://hal.archives-ouvertes.fr/hal-01690581 ; Multimedia Tools and Applications, Springer Verlag, 2017, 76 (21), pp.22547 - 22567. ⟨10.1007/s11042-017-4730-x⟩ (2017)
	BASE

13	Enabling Embodied Analogies in Intelligent Music Systems ...
	Paolizzo, Fabio. - : arXiv, 2017
	BASE

14	Narrative Smoothing: Dynamic Conversational Network for the Analysis of TV Series Plots
	Bost, Xavier; Labatut, Vincent; Gueye, Serigne...
	In: DyNo: 2nd International Workshop on Dynamics in Networks, in conjunction with the 2016 IEEE/ACM International Conference ASONAM ; https://hal.archives-ouvertes.fr/hal-01276708 ; DyNo: 2nd International Workshop on Dynamics in Networks, in conjunction with the 2016 IEEE/ACM International Conference ASONAM, Aug 2016, San Francisco, United States. pp.1111-1118, ⟨10.1109/ASONAM.2016.7752379⟩ (2016)
	BASE

15	Multilingual Visual Sentiment Concept Matching ...
	Pappas, Nikolaos; Redi, Miriam; Topkara, Mercan. - : arXiv, 2016
	BASE

16	Hierarchical topic structuring: from dense segmentation to topically focused fragments via burst analysis
	Simon, Anca; Sébillot, Pascale; Gravier, Guillaume
	In: Recent Advances on Natural Language Processing ; https://hal.archives-ouvertes.fr/hal-01186443 ; Recent Advances on Natural Language Processing, 2015, Hissar, Bulgaria (2015)
	BASE

17	Temporal re-scoring vs. temporal descriptors for semantic indexing of videos
	Hamadi, Abdelkader; Mulhem, Philippe; Quénot, Georges
	In: 13th International Workshop on Content-Based Multimedia Indexing (CBMI) ; https://hal.archives-ouvertes.fr/hal-01230719 ; 13th International Workshop on Content-Based Multimedia Indexing (CBMI), Jun 2015, Prague, Czech Republic. pp.1-4, ⟨10.1109/CBMI.2015.7153626⟩ (2015)
	BASE

18	Visual Affect Around the World: A Large-scale Multilingual Visual Sentiment Ontology ...
	Jou, Brendan; Chen, Tao; Pappas, Nikolaos. - : arXiv, 2015
	BASE

19	Novel perspectives and approaches to video summarization
	Guan, Genliang. - : The University of Sydney, 2015. : Faculty of Engineering and Information Technologies, School of Information Technologies, 2015
	BASE

20	Planning Human-Computer Improvisation
	Nika, Jérôme; Echeveste, José-Manuel; Chemillier, Marc...
	In: International Computer Music Conference ; https://hal.archives-ouvertes.fr/hal-01053834 ; International Computer Music Conference, Sep 2014, Athens, Greece ; http://icmc14-smc14.net (2014)
	BASE
