1. Analysis of Language Change in Collaborative Instruction Following ...
2. Analysis of Language Change in Collaborative Instruction Following
In: Proceedings of the Society for Computation in Linguistics (2022)
3. Analysis of Language Change in Collaborative Instruction Following ...
4. Analysis of Language Change in Collaborative Instruction Following ...
5. Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior ...
6. Evaluating Models' Local Decision Boundaries via Contrast Sets ...
Gardner, Matt; Artzi, Yoav; Basmova, Victoria; Berant, Jonathan; Bogin, Ben; Chen, Sihao; Dasigi, Pradeep; Dua, Dheeru; Elazar, Yanai; Gottumukkala, Ananth; Gupta, Nitish; Hajishirzi, Hanna; Ilharco, Gabriel; Khashabi, Daniel; Lin, Kevin; Liu, Jiangming; Liu, Nelson F.; Mulcaire, Phoebe; Ning, Qiang; Singh, Sameer; Smith, Noah A.; Subramanian, Sanjay; Tsarfaty, Reut; Wallace, Eric; Zhang, Ally; Zhou, Ben. arXiv, 2020
Abstract: Standard test sets for supervised learning evaluate in-distribution generalization. Unfortunately, when a dataset has systematic gaps (e.g., annotation artifacts), these evaluations are misleading: a model can learn simple decision rules that perform well on the test set but do not capture a dataset's intended capabilities. We propose a new annotation paradigm for NLP that helps to close systematic gaps in the test data. In particular, after a dataset is constructed, we recommend that the dataset authors manually perturb the test instances in small but meaningful ways that (typically) change the gold label, creating contrast sets. Contrast sets provide a local view of a model's decision boundary, which can be used to more accurately evaluate a model's true linguistic capabilities. We demonstrate the efficacy of contrast sets by creating them for 10 diverse NLP datasets (e.g., DROP reading comprehension, UD parsing, IMDb sentiment analysis). Although our contrast sets are not explicitly adversarial, model ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2004.02709 https://dx.doi.org/10.48550/arxiv.2004.02709
7. What is Learned in Visually Grounded Neural Syntax Acquisition ...
8. A Corpus for Reasoning About Natural Language Grounded in Photographs ...
9. Touchdown: Natural Language Navigation and Spatial Reasoning in Visual Street Environments ...
10. Mapping Instructions and Visual Observations to Actions with Reinforcement Learning ...
12. Learning to Automatically Solve Algebra Word Problems
In: MIT web domain (2014)