2. Language Models Use Monotonicity to Assess NPI Licensing ...
Source: BASE
5. Causal Transformers Perform Below Chance on Recursive Nested Constructions, Unlike Humans ...
6. Sparse Interventions in Language Models with Differentiable Masking ...
8. Generalising to German Plural Noun Classes, from the Perspective of a Recurrent Neural Network ...
9. Masked Language Modeling and the Distributional Hypothesis: Order Word Matters Pre-training for Little ...
10. Mechanisms for Handling Nested Dependencies in Neural-Network Language Models and Humans ...
11. Assessing incrementality in sequence-to-sequence models ...

Abstract: Since their inception, encoder-decoder models have successfully been applied to a wide array of problems in computational linguistics. The most recent successes are predominantly due to the use of different variations of attention mechanisms, but their cognitive plausibility is questionable. In particular, because past representations can be revisited at any point in time, attention-centric methods seem to lack an incentive to build up incrementally more informative representations of incoming sentences. This way of processing stands in stark contrast with the way in which humans are believed to process language: continuously and rapidly integrating new information as it is encountered. In this work, we propose three novel metrics to assess the behavior of RNNs with and without an attention mechanism and identify key differences in the way the different model types process sentences. ... Accepted at Repl4NLP, ACL.

Keywords: Computation and Language (cs.CL); FOS: Computer and information sciences; Machine Learning (cs.LG)

URL: https://dx.doi.org/10.48550/arxiv.1906.03293 ; https://arxiv.org/abs/1906.03293
12. Compositionality decomposed: how do neural networks generalise? ...
14. Under the Hood: Using Diagnostic Classifiers to Investigate and Improve how Language Models Track Agreement Information ...
15. Do Language Models Understand Anything? On the Ability of LSTMs to Understand Negative Polarity Items ...
20. The time course of verb processing in Dutch sentences
In: http://www.cogsci.northwestern.edu/cogsci2004/papers/paper389.pdf (2004)