2. What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think
3. OTTers: One-turn Topic Transitions for Open-Domain Dialogue ...
4. What happens if you treat ordinal ratings as interval data? Human evaluations in NLP are even more under-powered than you think ...
5. Twenty Years of Confusion in Human Evaluation: NLG Needs Evaluation Sheets and Standardised Definition
7. How speakers adapt object descriptions to listeners under load
8. G-TUNA: a corpus of referring expressions in German, including duration information
9. Psycholinguistic Models of Sentence Processing Improve Sentence Readability Ranking ...
10. Inducing Clause-Combining Rules: A Case Study with the SPaRKy Restaurant Corpus