1 | Perspectives and Prospects on Transformer Architecture for Cross-Modal Tasks with Language and Vision ... | BASE
2 | Transformer-Exclusive Cross-Modal Representation for Vision and Language ... | BASE
3 | Transformer-Exclusive Cross-Modal Representation for Vision and Language ... | BASE
4 | Mapping Images to Sentiment Adjective Noun Pairs with Factorized Neural Nets ... | BASE