1 | The GEM Benchmark: Natural Language Generation, its Evaluation and Metrics
    In: Proceedings of the 1st Workshop on Natural Language Generation, Evaluation, and Metrics (GEM 2021), Aug 2021, Online, France. pp.96-120. DOI: 10.18653/v1/2021.gem-1.10. HAL: https://hal.archives-ouvertes.fr/hal-03466171
    Source: BASE

2 | BiSECT: Learning to Split and Rephrase Sentences with Bitexts ...
    Source: BASE

4 | Pre-train or Annotate? Domain Adaptation with a Constrained Budget ...
    Anthology paper link: https://aclanthology.org/2021.emnlp-main.409/
    Abstract: Recent work has demonstrated that pre-training in-domain language models can boost performance when adapting to a new domain. However, the costs associated with pre-training raise an important question: given a fixed budget, what steps should an NLP practitioner take to maximize performance? In this paper, we study domain adaptation under budget constraints, and approach it as a consumer choice problem between data annotation and pre-training. Specifically, we measure the annotation cost of three procedural text datasets and the pre-training cost of three in-domain language models. Then we evaluate the utility of different combinations of pre-training and data annotation under varying budget constraints to assess which combination strategy works best. We find that, for small budgets, spending all funds on annotation leads to the best performance; once the budget becomes large enough, a combination of data annotation and in-domain ...
    Keywords: Computational Linguistics; Language Models; Machine Learning; Machine Learning and Data Mining; Natural Language Processing
    URL: https://underline.io/lecture/37963-pre-train-or-annotatequestion-domain-adaptation-with-a-constrained-budget
    DOI: https://dx.doi.org/10.48448/z1gf-n855
    Source: BASE

6 | BiSECT: Learning to Split and Rephrase Sentences with Bitexts ...
    Source: BASE

7 | Sample data for "Design and Collection Challenges of Building an Academic Email Corpus for Linguistics and Computational Research" ...
    Source: BASE

8 | Sample data for "Design and Collection Challenges of Building an Academic Email Corpus for Linguistics and Computational Research" ...
    Source: BASE

9 | Controllable Text Simplification with Explicit Paraphrasing ...
    Source: BASE

10 | The effectiveness of the problem-based learning in medical cell biology education: A systematic meta-analysis
    In: Medicine (Baltimore) (2021)
    Source: BASE

11 | Controllable text simplification with explicit paraphrasing
    Source: BASE

12 | An Empirical Study of Pre-trained Transformers for Arabic Information Extraction ...
    Source: BASE

13 | Controllable Text Simplification with Explicit Paraphrasing ...
    Source: BASE

14 | Interactive Grounded Language Acquisition and Generalization in a 2D World ...
    Source: BASE

15 | Interactive Language Acquisition with One-shot Visual Concept Learning through a Conversational Game ...
    Source: BASE

16 | Guided Feature Transformation (GFT): A Neural Language Grounding Module for Embodied Agents ...
    Source: BASE

17 | A Word-Complexity Lexicon and A Neural Readability Ranking Model for Lexical Simplification ...
    Source: BASE

18 | A Deep Compositional Framework for Human-like Language Acquisition in Virtual Environment ...
    Source: BASE

19 | A Continuously Growing Dataset of Sentential Paraphrases ...
    Source: BASE

20 | Spectral Entropy Can Predict Changes of Working Memory Performance Reduced by Short-Time Training in the Delayed-Match-to-Sample Task
    Source: BASE