Page: 1 2 3 4 5 6 7 8 9... 690
81 | Adapting BigScience Multilingual Model to Unseen Languages ...
82 | On Efficiently Acquiring Annotations for Multilingual Models ...
83 | Team ÚFAL at CMCL 2022 Shared Task: Figuring out the correct recipe for predicting Eye-Tracking features using Pretrained Language Models ...
84 | Does Corpus Quality Really Matter for Low-Resource Languages? ...
85 | IIITDWD-ShankarB@ Dravidian-CodeMixi-HASOC2021: mBERT based model for identification of offensive content in south Indian languages ...
86 | mSLAM: Massively multilingual joint pre-training for speech and text ...
87 | On the Representation Collapse of Sparse Mixture of Experts ...
Chi, Zewen; Dong, Li; Huang, Shaohan; Dai, Damai; Ma, Shuming; Patra, Barun; Singhal, Saksham; Bajaj, Payal; Song, Xia; Wei, Furu. - : arXiv, 2022
Abstract: Sparse mixture of experts provides larger model capacity while requiring a constant computational overhead. It employs a routing mechanism to distribute input tokens to the best-matched experts according to their hidden representations. However, learning such a routing mechanism encourages token clustering around expert centroids, implying a trend toward representation collapse. In this work, we propose to estimate the routing scores between tokens and experts on a low-dimensional hypersphere. We conduct extensive experiments on cross-lingual language model pre-training and fine-tuning on downstream tasks. Experimental results across seven multilingual benchmarks show that our method achieves consistent gains. We also present a comprehensive analysis of the representation and routing behaviors of our models. Our method alleviates the representation collapse issue and achieves more consistent routing than baseline mixture-of-experts methods. ...
Keywords: Computation and Language (cs.CL); FOS: Computer and information sciences; Machine Learning (cs.LG)
URL: https://dx.doi.org/10.48550/arxiv.2204.09179 https://arxiv.org/abs/2204.09179
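The routing idea in this abstract — scoring tokens against experts on a low-dimensional hypersphere rather than in the full hidden space — can be sketched as cosine-similarity routing over L2-normalized embeddings. The sketch below is a minimal illustration under that reading; the function name, the projection matrix, the temperature value, and all sizes are assumptions for the example, not taken from the paper's code.

```python
import numpy as np

def hyperspherical_routing(hidden, expert_embed, proj, temperature=0.05):
    """Sketch of routing scores on a low-dimensional hypersphere.

    hidden:       (n_tokens, d_model) token hidden states
    expert_embed: (n_experts, d_low)  learnable expert embeddings
    proj:         (d_model, d_low)    projection into the routing space
    """
    # Project tokens into the low-dimensional routing space.
    t = hidden @ proj
    # L2-normalize both sides so every vector lies on the unit hypersphere;
    # the routing score is then a cosine similarity, independent of norm.
    t = t / np.linalg.norm(t, axis=-1, keepdims=True)
    e = expert_embed / np.linalg.norm(expert_embed, axis=-1, keepdims=True)
    logits = (t @ e.T) / temperature  # temperature sharpens the scores
    # Softmax over experts gives each token a routing distribution.
    z = np.exp(logits - logits.max(axis=-1, keepdims=True))
    probs = z / z.sum(axis=-1, keepdims=True)
    return probs, probs.argmax(axis=-1)  # distribution and top-1 expert

rng = np.random.default_rng(0)
probs, assign = hyperspherical_routing(
    hidden=rng.normal(size=(4, 16)),       # 4 tokens, toy model dim 16
    expert_embed=rng.normal(size=(2, 8)),  # 2 experts in an 8-dim space
    proj=rng.normal(size=(16, 8)),
)
```

Because the scores depend only on direction, no token or expert can dominate routing by growing its norm — which is one way to read the abstract's claim that the method discourages collapse toward expert centroids.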
88 | Politics and Virality in the Time of Twitter: A Large-Scale Cross-Party Sentiment Analysis in Greece, Spain and United Kingdom ...
89 | L3Cube-MahaHate: A Tweet-based Marathi Hate Speech Detection Dataset and BERT models ...
90 | Few-Shot Cross-lingual Transfer for Coarse-grained De-identification of Code-Mixed Clinical Texts ...
91 | A Unified Strategy for Multilingual Grammatical Error Correction with Pre-trained Cross-Lingual Language Model ...
92 | A New Generation of Perspective API: Efficient Multilingual Character-level Transformers ...
93 | Factual Consistency of Multilingual Pretrained Language Models ...
94 | Examining Scaling and Transfer of Language Model Architectures for Machine Translation ...
95 | MuMiN: A Large-Scale Multilingual Multimodal Fact-Checked Misinformation Social Network Dataset ...
96 | Mono vs Multilingual BERT for Hate Speech Detection and Text Classification: A Case Study in Marathi ...
100 | From Examples to Rules: Neural Guided Rule Synthesis for Information Extraction ...