21 |
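Linked Open Tafsir - Rekonstruktion der Entstehungsdynamik(en) des Korans mithilfe der Netzwerkmodellierung früher islamischer Überlieferungen ... [English: Linked Open Tafsir - Reconstructing the formation dynamic(s) of the Quran using network modelling of early Islamic traditions ...]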
BASE
22 |
EMBEDDIA tools output example corpus of Estonian, Croatian and Latvian news articles 1.0
24 |
Threat assessment, sense making, and critical decision-making in police, military, ambulance, and fire services
In: Research outputs 2022 to 2026 (2022)
25 |
The influence of singing with text and a neutral syllable on Portuguese children's vocal performance, song recognition, and use of singing voice
26 |
By the People Crowdsourcing Datasets from the Library of Congress
In: Journal of Open Humanities Data; Vol 8 (2022); 5 ; 2059-481X (2022)
27 |
Large-scale Bilingual Language-Image Contrastive Learning ...
Abstract:
This paper is a technical report sharing our experience and findings from building a Korean and English bilingual multimodal model. While many multimodal datasets focus on English, and multilingual multimodal research relies on machine-translated texts, such texts are limited in their ability to describe unique expressions, cultural information, and proper nouns in languages other than English. In this work, we collect 1.1 billion image-text pairs (708 million Korean and 476 million English) and train a bilingual multimodal model named KELIP. We introduce simple yet effective training schemes, including MAE pre-training and multi-crop augmentation. Extensive experiments demonstrate that a model trained with these schemes shows competitive performance in both languages. Moreover, we discuss multimodal-related research questions: 1) strong augmentation-based methods can distract the model from learning proper multimodal relations; 2) training a multimodal model without cross-lingual relation can ... : Accepted by ICLRW2022 ...
Keyword:
Computation and Language cs.CL; Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2203.14463 https://arxiv.org/abs/2203.14463
28 |
Bridging Video-text Retrieval with Multiple Choice Questions ...
35 |
FiNER-139: A Financial Numeric Entity Recognition Dataset ...
38 |
FiNER-139: A Financial Numeric Entity Recognition Dataset ...
39 |
Review on multichannel emotion perception in ASD (Zhang et al., 2022) ...
40 |
Review on multichannel emotion perception in ASD (Zhang et al., 2022) ...