3
Listening to Affected Communities to Define Extreme Speech: Dataset and Experiments ...
BASE

4
Differentiable Multi-Agent Actor-Critic for Multi-Step Radiology Report Summarization ...
BASE

5
Geographic Adaptation of Pretrained Language Models ...
Abstract:
Geographic linguistic features are commonly used to improve the performance of pretrained language models (PLMs) on NLP tasks where geographic knowledge is intuitively beneficial (e.g., geolocation prediction and dialect feature prediction). Existing work, however, leverages such geographic information in task-specific fine-tuning, failing to incorporate it into PLMs' geo-linguistic knowledge, which would make it transferable across different tasks. In this work, we introduce an approach to task-agnostic geoadaptation of PLMs that forces the PLM to learn associations between linguistic phenomena and geographic locations. More specifically, geoadaptation is an intermediate training step that couples masked language modeling and geolocation prediction in a dynamic multitask learning setup. In our experiments, we geoadapt BERTić -- a PLM for Bosnian, Croatian, Montenegrin, and Serbian (BCMS) -- using a corpus of geotagged BCMS tweets. Evaluation on three different tasks, namely unsupervised (zero-shot) and ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2203.08565 https://dx.doi.org/10.48550/arxiv.2203.08565
BASE

9
Graph Algorithms for Multiparallel Word Alignment
In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing, Association for Computational Linguistics, Nov 2021, Punta Cana, Dominican Republic ; https://hal.archives-ouvertes.fr/hal-03424044 ; https://2021.emnlp.org/ (2021)
BASE

11
Does He Wink or Does He Nod? A Challenging Benchmark for Evaluating Word Understanding of Language Models ...
BASE

12
Superbizarre Is Not Superb: Derivational Morphology Improves BERT's Interpretation of Complex Words ...
BASE

14
ParCourE: A Parallel Corpus Explorer for a Massively Multilingual Corpus ...
BASE

15
Multilingual LAMA: Investigating Knowledge in Multilingual Pretrained Language Models ...
BASE

16
Wine is Not v i n. -- On the Compatibility of Tokenizations Across Languages ...
BASE

18
Locating Language-Specific Information in Contextualized Embeddings ...
BASE

20
Measuring and Improving Consistency in Pretrained Language Models ...
BASE