
Search in the Catalogues and Directories

Hits 1 – 12 of 12

1
What Does BERT Look At? An Analysis of BERT's Attention ...
Abstract: Large pre-trained neural networks such as BERT have had great recent success in NLP, motivating a growing body of research investigating what aspects of language they are able to learn from unlabeled data. Most recent analysis has focused on model outputs (e.g., language model surprisal) or internal vector representations (e.g., probing classifiers). Complementary to these works, we propose methods for analyzing the attention mechanisms of pre-trained models and apply them to BERT. BERT's attention heads exhibit patterns such as attending to delimiter tokens, specific positional offsets, or broadly attending over the whole sentence, with heads in the same layer often exhibiting similar behaviors. We further show that certain attention heads correspond well to linguistic notions of syntax and coreference. For example, we find heads that attend to the direct objects of verbs, determiners of nouns, objects of prepositions, and coreferent mentions with remarkably high accuracy. Lastly, we propose an ... : BlackBoxNLP 2019 ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/1906.04341
https://dx.doi.org/10.48550/arxiv.1906.04341
BASE
2
Natural chunk-and-pass language processing: Just another joint source-channel coding model?
Clark, Kevin B. - : Taylor & Francis, 2018
BASE
3
Improving Coreference Resolution by Learning Entity-Level Distributed Representations ...
BASE
4
Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora ...
BASE
5
Large-scale Analysis of Counseling Conversations: An Application of Natural Language Processing to Mental Health ...
BASE
6
Inducing Domain-Specific Sentiment Lexicons from Unlabeled Corpora
BASE
7
Large-scale Analysis of Counseling Conversations: An Application of Natural Language Processing to Mental Health
BASE
8
Social biases determine spatiotemporal sparseness of ciliate mating heuristics
Clark, Kevin B. - : Landes Bioscience, 2012
BASE
9
Metaphors we teach by: An embodied cognitive analysis of No Child Left Behind
In: Semiotica. - Berlin ; Boston : De Gruyter Mouton 161 (2006) 1-4, 265
OLC Linguistik
10
Metaphors we teach by : an embodied cognitive analysis of 'No Child Left Behind'
In: Semiotica. - Berlin ; Boston : De Gruyter Mouton 161 (2006) 1-4, 265-289
BLLDB
OLC Linguistik
11
Discourse deficits following right hemisphere damage in deaf signers
In: Brain & language. - Orlando, Fla. [u.a.] : Elsevier 66 (1999) 2, 233-248
BLLDB
12
On Pierre Klossowski and the problem of transcription
In: MLN. - Baltimore, Md. : Johns Hopkins Univ. Press 97 (1982) 4, 827-839
BLLDB

Catalogues: 2
Bibliographies: 3
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 8