Search in the Catalogues and Directories

Hits 1 – 2 of 2

1
Language-Driven Region Pointer Advancement for Controllable Image Captioning ...
Abstract: Controllable Image Captioning is a recent sub-field in the multi-modal task of Image Captioning wherein constraints are placed on which regions in an image should be described in the generated natural language caption. This puts a stronger focus on producing more detailed descriptions, and opens the door for more end-user control over results. A vital component of the Controllable Image Captioning architecture is the mechanism that decides the timing of attending to each region through the advancement of a region pointer. In this paper, we propose a novel method for predicting the timing of region pointer advancement by treating the advancement step as a natural part of the language structure via a NEXT-token, motivated by a strong correlation to the sentence structure in the training data. We find that our timing agrees with the ground-truth timing in the Flickr30k Entities test data with a precision of 86.55% and a recall of 97.92%. Our model implementing this technique improves the state-of-the-art on ... : Accepted to COLING 2020 ...
Keyword: 68T07, 68T45, 68T50; Computation and Language cs.CL; Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences; I.2.7; I.2.10; I.5.1; Machine Learning cs.LG; Neural and Evolutionary Computing cs.NE
URL: https://arxiv.org/abs/2011.14901
https://dx.doi.org/10.48550/arxiv.2011.14901
BASE
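
The abstract above describes advancing a region pointer whenever the caption contains a NEXT-token, and reports precision and recall of the predicted timing against ground truth. A minimal sketch of that idea follows; it is not the authors' implementation. The token spelling `<NEXT>`, the function names, the clamping at the last region, and the set-based way of scoring advancement steps are all assumptions made for illustration.

```python
NEXT = "<NEXT>"  # hypothetical spelling of the NEXT-token


def align_words_to_regions(tokens, regions):
    """Pair each emitted word with the region the pointer is on at that moment."""
    pointer = 0
    aligned = []
    for tok in tokens:
        if tok == NEXT:
            # Assumption: the pointer stays on the last region if NEXT
            # appears more often than there are regions left.
            pointer = min(pointer + 1, len(regions) - 1)
            continue
        aligned.append((tok, regions[pointer]))
    return aligned


def precision_recall(predicted_steps, gold_steps):
    """Score predicted advancement positions against ground-truth positions.

    Assumption: a prediction counts as correct only if it falls on exactly
    the same position as a ground-truth advancement.
    """
    predicted, gold = set(predicted_steps), set(gold_steps)
    true_pos = len(predicted & gold)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    return precision, recall


if __name__ == "__main__":
    caption = ["a", "dog", NEXT, "chases", NEXT, "a", "ball"]
    regions = ["region_dog", "region_action", "region_ball"]
    for word, region in align_words_to_regions(caption, regions):
        print(word, "->", region)
    print(precision_recall([2, 4, 6], [2, 4]))
```

Treating the advancement as an ordinary vocabulary token means the language model itself decides when to move on, so no separate gating module is needed in this sketch.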
2
Language-Driven Region Pointer Advancement for Controllable Image Captioning
In: Conference papers (2020)
BASE
