1 | Discourse Analysis for Evaluating Coherence in Video Paragraph Captions ... | BASE
2 | CrossVQA: Scalably Generating Benchmarks for Systematically Testing VQA Generalization ... | BASE
3 | Mind the Context: The Impact of Contextualization in Neural Module Networks for Grounding Visual Referring Expressions ... | BASE
4 | Words aren't enough, their order matters: On the Robustness of Grounding Visual Referring Expressions ... | BASE