1. A Quantitative and Qualitative Analysis of Schizophrenia Language ...
2. Towards Responsible Natural Language Annotation for the Varieties of Arabic ...
3. Gender Bias Amplification During Speed-Quality Optimization in Neural Machine Translation ...
4. Few-shot Learning with Multilingual Language Models ...
Lin, Xi Victoria; Mihaylov, Todor; Artetxe, Mikel; Wang, Tianlu; Chen, Shuohui; Simig, Daniel; Ott, Myle; Goyal, Naman; Bhosale, Shruti; Du, Jingfei; Pasunuru, Ramakanth; Shleifer, Sam; Koura, Punit Singh; Chaudhary, Vishrav; O'Horo, Brian; Wang, Jeff; Zettlemoyer, Luke; Kozareva, Zornitsa; Diab, Mona; Stoyanov, Veselin; Li, Xian. arXiv, 2021.

Abstract: Large-scale autoregressive language models such as GPT-3 are few-shot learners that can perform a wide range of language tasks without fine-tuning. While these models are known to jointly represent many different languages, their training data is dominated by English, potentially limiting their cross-lingual generalization. In this work, we train multilingual autoregressive language models on a balanced corpus covering a diverse set of languages, and study their few- and zero-shot learning capabilities across a wide range of tasks. Our largest model, with 7.5 billion parameters, sets a new state of the art in few-shot learning in more than 20 representative languages, outperforming GPT-3 of comparable size in multilingual commonsense reasoning (a +7.4% absolute accuracy improvement in 0-shot settings and +9.4% in 4-shot settings) and natural language inference (+5.4% in each of the 0-shot and 4-shot settings). On the FLORES-101 machine translation benchmark, our model outperforms GPT-3 on 171 out of 182 ... (36 pages)

Keywords: Artificial Intelligence (cs.AI); Computation and Language (cs.CL); FOS: Computer and information sciences

URL: https://arxiv.org/abs/2112.10668
DOI: https://dx.doi.org/10.48550/arxiv.2112.10668
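The few- and zero-shot evaluation described in the abstract above works by prepending k labeled demonstrations to a query before asking the model to continue the text. A minimal sketch of that prompt construction (the function name, "Q:"/"A:" template, and example task are illustrative assumptions, not the paper's actual code):

```python
def build_few_shot_prompt(demonstrations, query):
    """Build a k-shot prompt for an autoregressive language model.

    demonstrations: list of (input, output) pairs used as in-context examples;
                    an empty list yields a zero-shot prompt.
    query:          the new input the model should complete.
    """
    blocks = [f"Q: {x}\nA: {y}" for x, y in demonstrations]
    # The final block leaves the answer slot open for the model to fill in.
    blocks.append(f"Q: {query}\nA:")
    return "\n\n".join(blocks)


# A 2-shot prompt; the 0-shot and 4-shot settings in the abstract differ
# only in how many demonstrations are included here.
prompt = build_few_shot_prompt([("2+2?", "4"), ("3+3?", "6")], "5+5?")
```

The same template is reused for every demonstration so the model can infer the task format purely from the context, which is what distinguishes in-context learning from fine-tuning.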
5. AnswerSumm: A Manually-Curated Dataset and Pipeline for Answer Summarization ...
7. Detecting Hallucinated Content in Conditional Neural Sequence Generation ...
9. Adapting High-resource NMT Models to Translate Low-resource Related Languages without Parallel Data ...
10. Discrete Cosine Transform as Universal Sentence Encoder ...
12. Detecting Urgency Status of Crisis Tweets: A Transfer Learning Approach for Low Resource Languages ...
13. DeSePtion: Dual Sequence Prediction and Adversarial Examples for Improved Fact-Checking ...
14. Multitask Learning for Cross-Lingual Transfer of Semantic Dependencies ...
15. Overview for the Second Shared Task on Language Identification in Code-Switched Data ...
17. Creating a Large Multi-Layered Representational Repository of Linguistic Code Switched Arabic Data ...
18. Identifying Nuances in Fake News vs. Satire: Using Semantic and Linguistic Cues ...
20. Named Entity Recognition on Code-Switched Data: Overview of the CALCS 2018 Shared Task ...