1 |
Probing for the Usage of Grammatical Number ...
Abstract:
A central quest of probing is to uncover how pre-trained models encode a linguistic property within their representations. An encoding, however, might be spurious, i.e., the model might not rely on it when making predictions. In this paper, we try to find encodings that the model actually uses, introducing a usage-based probing setup. We first choose a behavioral task which cannot be solved without using the linguistic property. Then, we attempt to remove the property by intervening on the model's representations. We contend that, if an encoding is used by the model, its removal should harm the performance on the chosen behavioral task. As a case study, we focus on how BERT encodes grammatical number, and on how it uses this encoding to solve the number agreement task. Experimentally, we find that BERT relies on a linear encoding of grammatical number to produce the correct behavioral output. We also find that BERT uses a separate encoding of grammatical number for nouns and verbs. Finally, we identify in ... : ACL 2022 (Main Conference). The discussion section had been inadvertently removed before the article was published on arXiv ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2204.08831 https://dx.doi.org/10.48550/arxiv.2204.08831
BASE
2 |
Does BERT really agree ? Fine-grained Analysis of Lexical Dependence on a Syntactic Task ...
3 |
Automatic enjambment detection as a new source of evidence in Spanish versification
In: Plotting Poetry: On Mechanically Enhanced Reading ; https://hal.archives-ouvertes.fr/hal-03255481 ; Bories, Anne-Sophie; Purnelle, Gérald; Marchal, Hugues. Plotting Poetry: On Mechanically Enhanced Reading, Presses Universitaires de Liège, 2021, 978-2-87562-280-8 ; http://www.presses.uliege.be/ (2021)
4 |
The Corpus for Idiolectal Research (CIDRE)
In: European Association of Digital Humanities Conference (EADH 2021) ; https://hal.archives-ouvertes.fr/hal-03353520 ; European Association of Digital Humanities Conference (EADH 2021), Sep 2021, Krasnoyarsk, Russia (2021)
5 |
Evaluating Hierarchical Clustering Methods for Corpora with Chronological Order
In: EADH2021: Interdisciplinary Perspectives on Data. Second International Conference of the European Association for Digital Humanities ; https://hal.archives-ouvertes.fr/hal-03341803 ; EADH2021: Interdisciplinary Perspectives on Data. Second International Conference of the European Association for Digital Humanities, EADH, Sep 2021, Krasnoyarsk, Russia ; https://eadh2020-2021.org/ (2021)
6 |
The Corpus for Idiolectal Research (CIDRE)
In: EISSN: 2059-481X ; Journal of Open Humanities Data ; https://hal.archives-ouvertes.fr/hal-03310451 ; Journal of Open Humanities Data, Ubiquity Press, 2021, 7, pp.15. ⟨10.5334/johd.42⟩ (2021)
7 |
Text Zoning of Theater Reviews: How Different are Journalistic from Blogger Reviews?
In: Workshop on Natural Language Processing for Digital Humanities ; https://hal.archives-ouvertes.fr/hal-03498270 ; Workshop on Natural Language Processing for Digital Humanities, Dec 2021, Silchar, India ; https://rootroo.com/downloads/nlp4dh_proceedings_draft.pdf (2021)
11 |
The Corpus for Idiolectal Research (CIDRE)
In: Journal of Open Humanities Data; Vol 7 (2021); 15 ; 2059-481X (2021)
14 |
ACCOLÉ : Annotation Collaborative d'erreurs de traduction pour COrpus aLignÉs – Nouvelles fonctionnalités
In: Actes des 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT). ; 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT) ; https://hal.archives-ouvertes.fr/hal-03047150 ; 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT), 2020, Montrouge, France. pp.1-8 (2020)
15 |
Glossary: Introduction to the Digital Humanities ; Glossaire : Introduction aux humanités numériques
In: https://hal.archives-ouvertes.fr/hal-02410396 ; 2020 (2020)
16 |
Multi-SimLex: A Large-Scale Evaluation of Multilingual and Cross-Lingual Lexical Semantic Similarity
In: ISSN: 0891-2017 ; EISSN: 1530-9312 ; Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-02975786 ; Computational Linguistics, Massachusetts Institute of Technology Press (MIT Press), 2020, 46 (4), pp.847-897 ; https://direct.mit.edu/coli/article/46/4/847/97326/Multi-SimLex-A-Large-Scale-Evaluation-of (2020)
17 |
Semi-Supervised Learning on Meta Structure: Multi-Task Tagging and Parsing in Low-Resource Scenarios
In: Conference of the Association for the Advancement of Artificial Intelligence ; https://hal.archives-ouvertes.fr/hal-02895835 ; Conference of the Association for the Advancement of Artificial Intelligence, Association for the Advancement of Artificial Intelligence, Feb 2020, New York, United States ; https://aaai.org/Conferences/AAAI-20/ (2020)
18 |
Lexical encoding of multiword expressions in XMG
In: Actes des 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT). ; 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT) ; https://hal.archives-ouvertes.fr/hal-03047145 ; 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT), Dec 2020, Montrouge, France. pp.60-63 (2020)
19 |
Classification des catégories grammaticales sur deux corpus longitudinaux d’enfants
In: Actes des 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT). ; 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT) ; https://hal.archives-ouvertes.fr/hal-03047149 ; 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT), 2020, Montrouge, France. pp.23-33 (2020)
20 |
Longform recordings : Opportunities and challenges ; Enregistrements de longue durée: Opportunités et défis
In: Actes des 2èmes journées scientifiques du Groupement de Recherche Linguistique Informatique Formelle et de Terrain (LIFT). ; LIFT 2020 - 2èmes journées scientifiques du Groupement de Recherche "Linguistique informatique, formelle et de terrain" ; https://hal.archives-ouvertes.fr/hal-03047153 ; LIFT 2020 - 2èmes journées scientifiques du Groupement de Recherche "Linguistique informatique, formelle et de terrain", Dec 2020, Montrouge / Virtual, France. pp.64-71 (2020)