1 |
Evaluating Explanations: How Much Do Explanations from the Teacher Aid Students?
In: Transactions of the Association for Computational Linguistics, Vol 10, Pp 359-375 (2022)
2 |
WikiAsp: A Dataset for Multi-domain Aspect-based Summarization
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 211-225 (2021)
3 |
Lexically Aware Semi-Supervised Learning for OCR Post-Correction
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 1285-1302 (2021)
4 |
How Can We Know When Language Models Know? On the Calibration of Language Models for Question Answering
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 962-977 (2021)
Abstract:
Recent works have shown that language models (LMs) capture different types of knowledge regarding facts or common sense. However, because no model is perfect, they still fail to provide appropriate answers in many cases. In this paper, we ask the question, “How can we know when language models know, with confidence, the answer to a particular query?” We examine this question from the point of view of calibration, the property of a probabilistic model’s predicted probabilities actually being well correlated with the probabilities of correctness. We examine three strong generative models (T5, BART, and GPT-2) and study whether their probabilities on QA tasks are well calibrated, finding the answer is a relatively emphatic no. We then examine methods to calibrate such models to make their confidence scores correlate better with the likelihood of correctness through fine-tuning, post-hoc probability modification, or adjustment of the predicted outputs or inputs. Experiments on a diverse range of datasets demonstrate the effectiveness of our methods. We also perform analysis to study the strengths and limitations of these methods, shedding light on further improvements that may be made in methods for calibrating LMs. We have released the code at https://github.com/jzbjyb/lm-calibration.
Keyword:
Computational linguistics. Natural language processing; P98-98.5
URL: https://doaj.org/article/2dc285d060674cb7be1481e110573a1f https://doi.org/10.1162/tacl_a_00407
5 |
Reducing Confusion in Active Learning for Part-Of-Speech Tagging
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 1-16 (2021)
6 |
MasakhaNER: Named Entity Recognition for African Languages
In: Transactions of the Association for Computational Linguistics, Vol 9, Pp 1116-1131 (2021)
7 |
Collection of a simultaneous translation corpus for comparative analysis
In: http://www.lrec-conf.org/proceedings/lrec2014/pdf/162_Paper.pdf (2014)
8 |
A monotonic statistical machine translation approach to speaking style transformation
In: http://www.phontron.com/paper/neubig12csl.pdf (2012)
9 |
An Event-Related Brain Potential Study on the Impact of Speech Recognition Errors
In: http://www.phontron.com/paper/sakti14apsipa.pdf
10 |
The NAIST English Speech Recognition System for IWSLT 2015
In: http://www.phontron.com/paper/heck15iwslt.pdf
11 |
The NAIST English Speech Recognition System for IWSLT 2015
In: http://workshop2015.iwslt.org/downloads/IWSLT_2015_EP_8.pdf
12 |
Rule-based Syntactic Preprocessing for Syntax-based Machine Translation
In: http://www.aclweb.org/anthology/W/W14/W14-4004.pdf
13 |
Unsupervised Learning of Lexical Information for Language Processing Systems
In: http://www.phontron.com/paper/neubig-thesis.pdf
14 |
Data-Driven Generation of Text Balloons based on Linguistic and Acoustic Features of a Comics-Anime Corpus
In: http://www.phontron.com/paper/matsumiya14interspeech.pdf
15 |
Learning a Language Model from Continuous Speech (INTERSPEECH 2010)
In: http://www.ar.media.kyoto-u.ac.jp/EN/bib/intl/NEU-INTERSP10.pdf
16 |
Neural Reranking Improves Subjective Quality of Machine Translation: NAIST at WAT2015
In: http://www.phontron.com/paper/neubig15wat.pdf
17 |
Neural Reranking Improves Subjective Quality of Machine Translation: NAIST at WAT2015
In: http://wing.comp.nus.edu.sg/%7Eantho/W/W15/W15-5003.pdf
18 |
Substring-based Machine Translation
In: http://www.phontron.com/paper/neubig13mtj.pdf
19 |
TOWARDS LANGUAGE PRESERVATION: PRELIMINARY COLLECTION AND VOWEL ANALYSIS OF INDONESIAN ETHNIC SPEECH DATA
In: http://www.phontron.com/paper/sani12ococosda.pdf
20 |
TOWARDS LANGUAGE PRESERVATION: PRELIMINARY COLLECTION AND VOWEL ANALYSIS OF INDONESIAN ETHNIC SPEECH DATA
In: http://spalab.naist.jp/~tomoki/Tomoki/Conferences/OCOCOSDA2012_LangPreserv.pdf