101 |
Perspective-taking and Pragmatics for Generating Empathetic Responses Focused on Emotion Causes ...
BASE
102 |
Wikily Supervised Neural Translation Tailored to Cross-Lingual Tasks ...
103 |
Vision Guided Generative Pre-trained Language Models for Multimodal Abstractive Summarization ...
104 |
Sorting through the noise: Testing robustness of information processing in pre-trained language models ...
105 |
Building the Directed Semantic Graph for Coherent Long Text Generation ...
106 |
Detect and Classify – Joint Span Detection and Classification for Health Outcomes ...
107 |
Cross-Attention is All You Need: Adapting Pretrained Transformers for Machine Translation ...
108 |
Evaluation of Summarization Systems across Gender, Age, and Race ...
109 |
A Language Model-based Generative Classifier for Sentence-level Discourse Parsing ...
110 |
Controllable Neural Dialogue Summarization with Personal Named Entity Planning ...
112 |
Graphine: A Dataset for Graph-aware Terminology Definition Generation ...
113 |
CSDS: A Fine-Grained Chinese Dataset for Customer Service Dialogue Summarization ...
114 |
Connecting Attributions and QA Model Behavior on Realistic Counterfactuals ...
115 |
Provable Limitations of Acquiring Meaning from Ungrounded Form: What will Future Language Models Understand? ...
116 |
Generation and Extraction Combined Dialogue State Tracking with Hierarchical Ontology Integration ...
117 |
Error-Sensitive Evaluation for Ordinal Target Variables ...
Abstract:
Product reviews and satisfaction surveys seek customer feedback in the form of ranked scales. In these settings, widely used evaluation metrics including F1 and accuracy ignore the rank in the responses (e.g., ‘very likely’ is closer to ‘likely’ than ‘not at all’). In this paper, we hypothesize that the order of class values is important for evaluating classifiers on ordinal target variables and should not be disregarded. To test this hypothesis, we compared Multi-class Classification (MC) and Ordinal Regression (OR) by applying OR and MC to benchmark tasks involving ordinal target variables using the same underlying model architecture. Experimental results show that while MC outperformed OR for some datasets in accuracy and F1, OR is significantly better than MC for minimizing the error between prediction and target for all benchmarks, as revealed by error-sensitive metrics, e.g. mean-squared error (MSE) and Spearman correlation. Our findings motivate the need to establish consistent, error-sensitive ...
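As a rough illustration of the abstract's point that accuracy ignores rank while error-sensitive metrics do not, here is a minimal sketch. The labels, the two sets of predictions, and the helper names `accuracy` and `mse` are made up for illustration and are not taken from the paper.

```python
# Hypothetical 5-point ordinal scale: 0 = "not at all" ... 4 = "very likely".
y_true = [4, 4, 3, 1, 0, 2]

# Two illustrative classifiers that misclassify the same two items.
pred_a = [4, 3, 3, 2, 0, 2]  # each mistake is one step from the true rank
pred_b = [4, 0, 3, 4, 0, 2]  # mistakes are four and three steps away


def accuracy(y, p):
    """Fraction of exact matches; blind to how far off a miss is."""
    return sum(t == q for t, q in zip(y, p)) / len(y)


def mse(y, p):
    """Mean squared error over the ordinal ranks; penalizes distant misses."""
    return sum((t - q) ** 2 for t, q in zip(y, p)) / len(y)


print("accuracy:", accuracy(y_true, pred_a), accuracy(y_true, pred_b))
print("mse:     ", mse(y_true, pred_a), mse(y_true, pred_b))
```

Both classifiers reach the same accuracy, yet the second has a much larger MSE, which is the kind of gap an error-sensitive metric is meant to expose.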
Keyword:
Computational Linguistics; Machine Learning; Natural Language Processing; Text Generation
URL: https://dx.doi.org/10.48448/pxm0-t717 https://underline.io/lecture/39289-error-sensitive-evaluation-for-ordinal-target-variables
119 |
Data-to-text Generation by Splicing Together Nearest Neighbors ...
120 |
Natural Language Processing Meets Quantum Physics: A Survey and Categorization ...