
Search in the Catalogues and Directories

Hits 1 – 12 of 12

1
Characterizing Idioms: Conventionality and Contingency ...
BASE
2
Optimizing Deeper Transformers on Small Datasets ...
Read paper: https://www.aclanthology.org/2021.acl-long.163
Abstract: It is a common belief that training deep transformers from scratch requires large datasets. Consequently, for small datasets, people usually use shallow and simple additional layers on top of pre-trained models during fine-tuning. This work shows that this does not always need to be the case: with proper initialization and optimization, the benefits of very deep transformers can carry over to challenging tasks with small datasets, including Text-to-SQL semantic parsing and logical reading comprehension. In particular, we successfully train 48 layers of transformers, comprising 24 fine-tuned layers from pre-trained RoBERTa and 24 relation-aware layers trained from scratch. With fewer training steps and no task-specific pre-training, we obtain state-of-the-art performance on the challenging cross-domain Text-to-SQL parsing benchmark Spider. We achieve this by deriving a novel Data-dependent Transformer Fixed-update initialization scheme ...
Keyword: Computational Linguistics; Condensed Matter Physics; Deep Learning; Electromagnetism; FOS Physical sciences; Information and Knowledge Engineering; Neural Network; Semantics
URL: https://dx.doi.org/10.48448/ehsy-3055
https://underline.io/lecture/25482-optimizing-deeper-transformers-on-small-datasets
BASE
3
Textual Time Travel: A Temporally Informed Approach to Theory of Mind ...
BASE
4
Modeling Event Plausibility with Consistent Conceptual Abstraction ...
BASE
5
ADEPT: An Adjective-Dependent Plausibility Task ...
BASE
6
An Analysis of Dataset Overlap on Winograd-Style Tasks ...
BASE
7
On the Systematicity of Probing Contextualized Word Representations: The Case of Hypernymy in BERT ...
BASE
8
Learning Efficient Task-Specific Meta-Embeddings with Word Prisms ...
BASE
9
Resolving Event Coreference with Supervised Representation Learning and Clustering-Oriented Regularization ...
BASE
10
Leveraging Lexical Resources for Learning Entity Embeddings in Multi-Relational Data ...
BASE
11
Distributional Semantics for Robust Automatic Summarization
BASE
12
Entity-based local coherence modelling using topological fields
In: Association for Computational Linguistics. Proceedings of the conference. - Stroudsburg, Penn. : ACL 48 (2010) 1, 186-195
BLLDB
