2. Effects of Parameter Norm Growth During Transformer Training: Inductive Bias from Gradient Descent

3. Softmax Tree: An Accurate, Fast Classifier When the Number of Classes Is Large

5. GOLD: Improving Out-of-Scope Detection in Dialogues using Data Augmentation

6. RuleBERT: Teaching Soft Rules to Pre-Trained Language Models

Anthology paper link: https://aclanthology.org/2021.emnlp-main.110/

Abstract: While pre-trained language models (PLMs) are the go-to solution to tackle many natural language processing problems, they are still very limited in their ability to capture and to use common-sense knowledge. In fact, even if information is available in the form of approximate (soft) logical rules, it is not clear how to transfer it to a PLM in order to improve its performance for deductive reasoning tasks. Here, we aim to bridge this gap by teaching PLMs how to reason with soft Horn rules. We introduce a classification task where, given facts and soft rules, the PLM should return a prediction with a probability for a given hypothesis. We release the first dataset for this task, and we propose a revised loss function that enables the PLM to learn how to predict precise probabilities for the task. Our evaluation results show that the resulting fine-tuned models achieve very high performance, even on logical rules that were unseen at ...

Keywords: Language Models; Natural Language Processing; Semantic Evaluation; Sociolinguistics

URLs: https://dx.doi.org/10.48448/j90c-eg06 ; https://underline.io/lecture/37622-rulebert-teaching-soft-rules-to-pre-trained-language-models

7. Implicit Premise Generation with Discourse-aware Commonsense Knowledge Models

8. On the Challenges of Evaluating Compositional Explanations in Multi-Hop Inference: Relevance, Completeness, and Expert Ratings

11. Enhanced Language Representation with Label Knowledge for Span Extraction

12. The Devil is in the Detail: Simple Tricks Improve Systematic Generalization of Transformers

13. VeeAlign: Multifaceted Context Representation Using Dual Attention for Ontology Alignment

14. Shortcutted Commonsense: Data Spuriousness in Deep Learning of Commonsense Reasoning

15. On Classifying whether Two Texts are on the Same Side of an Argument

16. Causal Direction of Data Collection Matters: Implications of Causal and Anticausal Learning for NLP

17. MTAdam: Automatic Balancing of Multiple Training Loss Terms

18. Types of Out-of-Distribution Texts and How to Detect Them

19. Asking It All: Generating Contextualized Questions for any Semantic Role

20. Competency Problems: On Finding and Removing Artifacts in Language Data