1 | EMScore: Evaluating Video Captioning via Coarse-Grained and Fine-Grained Embedding Matching ... | BASE
2 | Learning to Discretely Compose Reasoning Module Networks for Video Captioning ... | BASE
3 | Knowledge-guided Pairwise Reconstruction Network for Weakly Supervised Referring Expression Grounding ... | BASE
4 | Adaptive Reconstruction Network for Weakly Supervised Referring Expression Grounding ... | BASE
5 | Learning to Assemble Neural Module Tree Networks for Visual Grounding ... | BASE