2 | Probing Image-Language Transformers for Verb Understanding ... | BASE
3 | Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers ... | BASE
5 | Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers ... | BASE
6 | Probing Image-Language Transformers for Verb Understanding ... | BASE
7 | Improving LSTM-based Video Description with Linguistic Knowledge Mined from Text ... | BASE