9. Finding Concept-specific Biases in Form–Meaning Associations
   Source: BASE
10. Searching for Search Errors in Neural Morphological Inflection
11. Applying the Transformer to Character-level Transduction
12. Quantifying Gender Bias Towards Politicians in Cross-Lingual Language Models
17. Examining the Inductive Bias of Neural Language Models with Artificial Languages
20. Differentiable Subset Pruning of Transformer Heads
    Abstract: Multi-head attention, a collection of several attention mechanisms that independently attend to different parts of the input, is the key ingredient in the Transformer. Recent work has shown, however, that a large proportion of the heads in a Transformer's multi-head attention mechanism can be safely pruned away without significantly harming the performance of the model; such pruning leads to models that are noticeably smaller and faster in practice. Our work introduces a new head pruning technique that we term differentiable subset pruning. Intuitively, our method learns per-head importance variables and then enforces a user-specified hard constraint on the number of unpruned heads. The importance variables are learned via stochastic gradient descent. We conduct experiments on natural language inference and machine translation; we show that differentiable subset pruning performs comparably or better than previous works while offering precise control of the sparsity level. ...
    Keywords: Computational Linguistics; Machine Learning; Machine Learning and Data Mining; Natural Language Processing
    URL: https://underline.io/lecture/38190-differentiable-subset-pruning-of-transformer-heads
    DOI: https://dx.doi.org/10.48448/bk2x-zy23
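The pruning recipe described in the abstract above has two moving parts: per-head importance variables learned by stochastic gradient descent, and a hard, user-specified budget on how many heads survive. A minimal sketch of that idea follows; the toy per-head utility values, the sigmoid relaxation, and the learning rate are all illustrative assumptions, not details taken from the paper:

```python
import numpy as np

def topk_head_mask(logits, k):
    """Hard 0/1 mask keeping exactly the k highest-scoring heads."""
    mask = np.zeros_like(logits)
    mask[np.argsort(logits)[-k:]] = 1.0
    return mask

rng = np.random.default_rng(0)
n_heads, k, lr = 12, 4, 0.1
utility = rng.normal(size=n_heads)  # synthetic stand-in for each head's usefulness
logits = np.zeros(n_heads)          # learnable per-head importance variables

for _ in range(200):
    gates = 1.0 / (1.0 + np.exp(-logits))  # soft relaxation of the hard mask
    # gradient of the toy loss -(utility . gates) with respect to the logits
    grad = -utility * gates * (1.0 - gates)
    logits -= lr * grad                    # plain SGD step

mask = topk_head_mask(logits, k)  # exactly k heads survive, as requested
```

The paper's actual method uses a more careful differentiable top-k relaxation; the sketch only shows why the hard constraint gives precise control of the sparsity level, in contrast to penalty-based pruning where the final head count is indirect.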