1 | FLAVA: A Foundational Language And Vision Alignment Model ... | BASE
3 | Learning to Reason: End-to-End Module Networks for Visual Question Answering ... | BASE
5 | Modeling Relationships in Referential Expressions with Compositional Modular Networks ... | BASE