21 |
#PraCegoVer: A Large Dataset for Image Captioning in Portuguese
|
|
|
|
In: Data; Volume 7; Issue 2; Pages: 13 (2022)
|
|
Abstract:
Automatically describing images in natural language is essential to the inclusion of visually impaired people on the Internet. This problem is known as Image Captioning. Many datasets exist in the literature, but most contain only English captions, and datasets with captions in other languages are scarce. We introduce #PraCegoVer, a multi-modal dataset with Portuguese captions based on posts from Instagram. It is the first large dataset for image captioning in Portuguese. In contrast to popular datasets, #PraCegoVer has only one reference per image, and both the mean and the variance of the reference sentence length are significantly high, which makes the dataset linguistically challenging. We carry out a detailed analysis to find the main classes and topics in our data. We compare #PraCegoVer to the MS COCO dataset in terms of sentence length and word frequency. We hope that the #PraCegoVer dataset encourages more work on the automatic generation of descriptions in Portuguese.
|
|
Keywords:
#PraCegoVer; image captioning; image captioning in Portuguese; image-to-text
|
|
URL: https://doi.org/10.3390/data7020013
|
|
BASE
|
|
22 |
Data-Driven Analysis of European Portuguese Nasal Vowel Dynamics in Bilabial Contexts
|
|
|
|
In: Applied Sciences; Volume 12; Issue 9; Pages: 4601 (2022)
|
|
BASE
|
|
23 |
Exploring the Age Effects on European Portuguese Vowel Production: An Ultrasound Study
|
|
|
|
In: Applied Sciences; Volume 12; Issue 3; Pages: 1396 (2022)
|
|
BASE
|
|
24 |
An Empirical Comparison of Portuguese and Multilingual BERT Models for Auto-Classification of NCM Codes in International Trade
|
|
|
|
In: Big Data and Cognitive Computing; Volume 6; Issue 1; Pages: 8 (2022)
|
|
BASE
|
|
25 |
Illusion of Truth: Analysing and Classifying COVID-19 Fake News in Brazilian Portuguese Language
|
|
|
|
In: Big Data and Cognitive Computing; Volume 6; Issue 2; Pages: 36 (2022)
|
|
BASE
|
|
26 |
Intonational meaning in Spanish: PRESEEA Madrid corpus examples ...
|
|
|
|
BASE
|
|
28 |
Intersectional Silencing in the Archive: Salaria Kea and The Spanish Civil War
|
|
|
|
In: Languages, Literatures, and Linguistics (2022)
|
|
BASE
|
|
29 |
What's the matter with |U| and |I|? On nasal vowel diphthongization and element asymmetry
|
|
|
|
BASE
|
|
30 |
Comunidade surda idosa em Portugal: necessidades, preocupações e expectativas face ao futuro
|
|
|
|
BASE
|
|
31 |
Towards Portuguese Sign Language Identification Using Deep Learning
|
|
|
|
BASE
|
|
32 |
Los tejidos en las cantigas gallego-portuguesas: nuevas lecturas ; The fabrics of the Gallego-Portuguese cantigas: new readings
|
|
Vallín, Gema. - : Universitat d'Alacant. Departament de Filologia Catalana, 2022
|
|
BASE
|
|
33 |
Theoretical-practical reflections on the teaching of linguistic variation in Portuguese ; Reflexões teórico-práticas sobre o ensino de variação linguística em língua portuguesa
|
|
|
|
In: Domínios de Lingu@gem; Vol. 16 No. 2 (2022): Estudos sobre a relação entre gramática e língua: diversidade, unidade e métodos; 656-690 ; 1980-5799 (2022)
|
|
BASE
|
|
34 |
PFN-PT: a Framenet annotator for Portuguese ; Anotação semântica automática: um novo Framenet para o português
|
|
|
|
In: Domínios de Lingu@gem; Ahead of Print ; 1980-5799 (2022)
|
|
BASE
|
|
35 |
Conception of grammar and grammar teaching in Neves’ works ; Concepção de gramática e de ensino de gramática nas obras de Neves
|
|
|
|
In: Domínios de Lingu@gem; Vol. 16 No. 2 (2022): Estudos sobre a relação entre gramática e língua: diversidade, unidade e métodos; 794-842 ; 1980-5799 (2022)
|
|
BASE
|
|
36 |
Translations of the French comedy "Bienvenue chez les Ch’tis" into Portuguese
|
|
|
|
BASE
|
|
37 |
Africanismos léxicos en la historia lexicográfica de Uruguay: acepciones, usos y etimologías ; Lexical Africanisms in Uruguay’s Lexicographic History: Meanings, Uses and Etymologies
|
|
|
|
BASE
|
|
38 |
A acessibilidade para pessoas surdas nos canais televisivos generalistas em sinal aberto
|
|
|
|
BASE
|
|
40 |
A motivação social da haplologia variável no português de Porto Alegre ; The social motivation of variable haplology in Porto Alegre Portuguese
|
|
|
|
BASE
|
|