61 | SummEval: Re-evaluating Summarization Evaluation
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 391-409 (2021)
62 | Neural OCR Post-Hoc Correction of Historical Corpora
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 479-493 (2021)
63 | Experts, Errors, and Context: A Large-Scale Study of Human Evaluation for Machine Translation
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 1460-1474 (2021)
64 | How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 962-977 (2021)
Abstract: Recent works have shown that language models (LMs) capture different types of knowledge regarding facts or common sense. However, because no model is perfect, they still fail to provide appropriate answers in many cases. In this paper, we ask the question, “How can we know when language models know, with confidence, the answer to a particular query?” We examine this question from the point of view of calibration, the property that a probabilistic model’s predicted probabilities are actually well correlated with the probabilities of correctness. We examine three strong generative models (T5, BART, and GPT-2) and study whether their probabilities on QA tasks are well calibrated, finding the answer to be a relatively emphatic no. We then examine methods to calibrate such models so that their confidence scores correlate better with the likelihood of correctness, through fine-tuning, post-hoc probability modification, or adjustment of the predicted outputs or inputs. Experiments on a diverse range of datasets demonstrate the effectiveness of our methods. We also perform analysis to study the strengths and limitations of these methods, shedding light on further improvements that may be made in methods for calibrating LMs. We have released the code at https://github.com/jzbjyb/lm-calibration.
Keyword: Computational linguistics. Natural language processing; P98-98.5
URL: https://doaj.org/article/2dc285d060674cb7be1481e110573a1f https://doi.org/10.1162/tacl_a_00407
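The abstract above defines calibration as agreement between a model's predicted probabilities and its actual likelihood of being correct, and lists post-hoc probability modification among the remedies studied. A minimal sketch of those two ideas follows: the function names, the equal-width binning scheme, and the toy data are our own illustrative assumptions, not the paper's pipeline (which also fine-tunes models and adjusts inputs and outputs).

```python
import numpy as np

def expected_calibration_error(confidences, correct, n_bins=10):
    """ECE: average |accuracy - confidence| over equal-width confidence
    bins, weighted by the fraction of examples falling in each bin."""
    confidences = np.asarray(confidences, dtype=float)
    correct = np.asarray(correct, dtype=float)
    bins = np.linspace(0.0, 1.0, n_bins + 1)
    ece = 0.0
    for lo, hi in zip(bins[:-1], bins[1:]):
        mask = (confidences > lo) & (confidences <= hi)
        if mask.any():
            ece += mask.mean() * abs(correct[mask].mean() - confidences[mask].mean())
    return ece

def temperature_scale(logits, temperature):
    """Post-hoc probability modification: divide logits by a scalar T > 0
    before the softmax. T > 1 softens (lowers) confidences, T < 1 sharpens."""
    z = np.asarray(logits, dtype=float) / temperature
    z -= z.max(axis=-1, keepdims=True)   # subtract max for numerical stability
    p = np.exp(z)
    return p / p.sum(axis=-1, keepdims=True)

# Toy usage with fabricated numbers (illustrative only, not real QA outputs):
rng = np.random.default_rng(0)
logits = rng.normal(size=(1000, 5))      # 5 candidate answers per query
correct = rng.integers(0, 2, size=1000)  # 1 = the model answered correctly
probs = temperature_scale(logits, temperature=2.0)
conf = probs.max(axis=-1)                # confidence in the top answer
print("ECE: %.3f" % expected_calibration_error(conf, correct))
```

In practice the temperature would be fit on held-out data (for example, by minimizing negative log-likelihood over a grid of T values) rather than fixed by hand as above.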
65 | Modeling Content and Context with Deep Relational Learning
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 100-119 (2021)
66 | A Statistical Analysis of Summarization Evaluation Metrics Using Resampling Methods
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 1132-1146 (2021)
67 | Optimizing over subsequences generates context-sensitive languages
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 528-537 (2021)
68 | Morphology Matters: A Multilingual Language Modeling Analysis
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 261-276 (2021)
69 | Quantifying Social Biases in NLP: A Generalization and Empirical Comparison of Extrinsic Fairness Metrics
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 1249-1267 (2021)
70 | Context-aware Adversarial Training for Name Regularity Bias in Named Entity Recognition
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 586-604 (2021)
71 | Continual Learning for Grounded Instruction Generation by Observing Human Following Behavior
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 1303-1319 (2021)
72 | Deciphering Undersegmented Ancient Scripts Using Phonetic Prior
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 69-81 (2021)
73 | Sparse, Dense, and Attentional Representations for Text Retrieval
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 329-345 (2021)
74 | Decoupling the Role of Data, Attention, and Losses in Multimodal Transformers
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 570-585 (2021)
75 | Partially Supervised Named Entity Recognition via the Expected Entity Ratio Loss
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 1320-1335 (2021)
76 | Formal Basis of a Language Universal
In: Computational Linguistics, Vol 47, Iss 1, Pp 9-42 (2021)
77 | Multimodal Pretraining Unmasked: A Meta-Analysis and a Unified Framework of Vision-and-Language BERTs
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 978-994 (2021)
78 | Revisiting Negation in Neural Machine Translation
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 740-755 (2021)
79 | Quantifying Cognitive Factors in Lexical Decline
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 1529-1545 (2021)
80 | Joint Universal Syntactic and Semantic Parsing
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 756-773 (2021)