1. FairLex: A Multilingual Benchmark for Evaluating Fairness in Legal Text Processing

6. Generalized Quantifiers as a Source of Error in Multilingual NLU Benchmarks

8. Factual Consistency of Multilingual Pretrained Language Models

9. Zero-Shot Dependency Parsing with Worst-Case Aware Automated Curriculum Learning

10. How Conservative are Language Models? Adapting to the Introduction of Gender-Neutral Pronouns

11. Replicating and Extending "Because Their Treebanks Leak": Graph Isomorphism, Covariants, and Parser Performance

12. The Impact of Positional Encodings on Multilingual Compression

Abstract:
In order to preserve word-order information in a non-autoregressive setting, transformer architectures tend to include positional knowledge, for instance by adding positional encodings to token embeddings. Several modifications have been proposed over the sinusoidal positional encodings used in the original transformer architecture; these include, for instance, separating positional encodings from token embeddings, or directly modifying attention weights based on the distance between word pairs. We first show that, surprisingly, while these modifications tend to improve monolingual language models, none of them results in better multilingual language models. We then explain why that is: sinusoidal encodings were explicitly designed to facilitate compositionality by allowing linear projections over arbitrary time steps. Higher variance in multilingual training distributions requires higher compression, in which case compositionality becomes indispensable. Learned absolute positional encodings (e.g., in mBERT) ...

Keywords:
Computation and Language (cs.CL); FOS: Computer and information sciences

URL: https://arxiv.org/abs/2109.05388
DOI: https://dx.doi.org/10.48550/arxiv.2109.05388
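
The compositionality property this abstract appeals to can be made concrete: for sinusoidal encodings, the encoding of position pos + k is a fixed, position-independent linear transform (a block-diagonal rotation) of the encoding of position pos. The NumPy sketch below is illustrative only, not code from the paper; the dimension and example positions are chosen arbitrarily.

import numpy as np

# Illustrative sketch (not from the paper): sinusoidal positional
# encodings and the linear-projection property described above.
# shift_matrix(k) @ PE(pos) == PE(pos + k) for every pos, a structure
# that learned absolute encodings are not guaranteed to have.

def sinusoidal_encoding(pos, d_model=8):
    # Even dimensions hold sin(pos * w_i), odd dimensions cos(pos * w_i),
    # with geometrically spaced frequencies w_i, as in the original transformer.
    i = np.arange(d_model // 2)
    freqs = 1.0 / (10000.0 ** (2 * i / d_model))
    enc = np.empty(d_model)
    enc[0::2] = np.sin(pos * freqs)
    enc[1::2] = np.cos(pos * freqs)
    return enc

def shift_matrix(k, d_model=8):
    # Block-diagonal matrix of 2x2 rotations, one per frequency; by the
    # angle-addition identities it maps PE(pos) to PE(pos + k).
    i = np.arange(d_model // 2)
    freqs = 1.0 / (10000.0 ** (2 * i / d_model))
    M = np.zeros((d_model, d_model))
    for j, w in enumerate(freqs):
        c, s = np.cos(k * w), np.sin(k * w)
        M[2 * j:2 * j + 2, 2 * j:2 * j + 2] = [[c, s], [-s, c]]
    return M

pos, k = 5, 3  # arbitrary example positions
assert np.allclose(shift_matrix(k) @ sinusoidal_encoding(pos),
                   sinusoidal_encoding(pos + k))

Since learned absolute encodings are free parameters, nothing forces them to satisfy such a shift structure; the abstract's argument is that the stronger compression demanded by multilingual training makes this kind of compositionality indispensable.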

13. Minimax and Neyman–Pearson Meta-Learning for Outlier Languages

14. Evaluation of Summarization Systems across Gender, Age, and Race

18. Can Language Models Encode Perceptual Structure Without Grounding? A Case Study in Color