1 |
Alternated Training with Synthetic and Authentic Data for Neural Machine Translation ...
BASE
2 |
CPM-2: Large-scale Cost-effective Pre-trained Language Models ...
BASE
3 |
VISITRON: Visual Semantics-Aligned Interactively Trained Object-Navigator ...
Abstract:
Interactive robots navigating photo-realistic environments need to be trained to effectively leverage and handle the dynamic nature of dialogue in addition to the challenges underlying vision-and-language navigation (VLN). In this paper, we present VISITRON, a multi-modal Transformer-based navigator better suited to the interactive regime inherent to Cooperative Vision-and-Dialog Navigation (CVDN). VISITRON is trained to: i) identify and associate object-level concepts and semantics between the environment and dialogue history, ii) identify when to interact vs. navigate via imitation learning of a binary classification head. We perform extensive pre-training and fine-tuning ablations with VISITRON to gain empirical insights and improve performance on CVDN. VISITRON's ability to identify when to interact leads to a natural generalization of the game-play mode introduced by Roman et al. (arXiv:2005.00728) for enabling the use of such models in different environments. VISITRON is competitive with models on the ...
Accepted at Findings of the Annual Meeting of the Association for Computational Linguistics (ACL) 2022; previous version accepted at Visually Grounded Interaction and Language (ViGIL) Workshop at NAACL 2021 ...
Keyword:
Artificial Intelligence cs.AI; Computation and Language cs.CL; Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences; I.2.9; Machine Learning cs.LG; Robotics cs.RO
URL: https://dx.doi.org/10.48550/arxiv.2105.11589 https://arxiv.org/abs/2105.11589
BASE
4 |
Assessing Multilingual Fairness in Pre-trained Multimodal Representations ...
BASE
5 |
DialogSum: A Real-Life Scenario Dialogue Summarization Dataset ...
BASE
6 |
Transfer Learning for Sequence Generation: from Single-source to Multi-source ...
BASE
8 |
Segment, Mask, and Predict: Augmenting Chinese Word Segmentation with Self-Supervision ...
BASE
9 |
Learning to Selectively Learn for Weakly-supervised Paraphrase Generation ...
BASE
10 |
SWSR: A Chinese Dataset and Lexicon for Online Sexism Detection ...
BASE
11 |
Analyzing the Limits of Self-Supervision in Handling Bias in Language ...
BASE
12 |
Statistically significant detection of semantic shifts using contextual word embeddings ...
BASE
14 |
Statistically Significant Detection of Semantic Shifts using Contextual Word Embeddings ...
BASE
15 |
Leveraging Word-Formation Knowledge for Chinese Word Sense Disambiguation ...
BASE
17 |
SWSR: A Chinese Dataset and Lexicon for Sexist Hate Speech Detection ...
BASE
19 |
Tapping into non-English-language science for the conservation of global biodiversity. ...
BASE