21. Evaluating Multilingual Text Encoders for Unsupervised Cross-Lingual Retrieval ...
    Source: BASE
22. RedditBias: A Real-World Resource for Bias Evaluation and Debiasing of Conversational Language Models ...
    Source: BASE
23. Parameter space factorization for zero-shot learning across tasks and languages ...
    Source: BASE
25. MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models ...
    Abstract: Recent work indicated that pretrained language models (PLMs) such as BERT and RoBERTa can be transformed into effective sentence and word encoders even via simple self-supervised techniques. Inspired by this line of work, in this paper we propose a fully unsupervised approach to improving word-in-context (WiC) representations in PLMs, achieved via a simple and efficient WiC-targeted fine-tuning procedure: MirrorWiC. The proposed method leverages only raw texts sampled from Wikipedia, assuming no sense-annotated data, and learns context-aware word representations within a standard contrastive learning setup. We experiment with a series of standard and comprehensive WiC benchmarks across multiple languages. Our proposed fully unsupervised MirrorWiC models obtain substantial gains over off-the-shelf PLMs across all monolingual, multilingual and cross-lingual setups. Moreover, on some standard WiC benchmarks, MirrorWiC is even on-par with supervised models fine-tuned with in-task data and sense labels. ...
    Keyword: cs.CL
    URL: https://dx.doi.org/10.17863/cam.78495 https://www.repository.cam.ac.uk/handle/1810/331050
    Source: BASE
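The MirrorWiC abstract above describes learning context-aware word representations "within a standard contrastive learning setup" from raw text alone. As an illustrative sketch only (not the authors' implementation), a standard contrastive (InfoNCE-style) objective over two augmented "views" of the same word occurrence can be written as follows; the function name, the temperature of 0.07, the toy embeddings, and the noise used to simulate a second augmented pass are all assumptions for illustration:

```python
import numpy as np

def info_nce_loss(view_a, view_b, temperature=0.07):
    """Contrastive (InfoNCE) loss over two augmented 'views' of word-in-context
    embeddings. Row i of view_a and view_b come from the same word occurrence
    (positive pair); all other rows in the batch serve as negatives."""
    # L2-normalise rows so the dot product is cosine similarity.
    a = view_a / np.linalg.norm(view_a, axis=1, keepdims=True)
    b = view_b / np.linalg.norm(view_b, axis=1, keepdims=True)
    logits = a @ b.T / temperature  # (n, n) similarity matrix
    # Softmax cross-entropy with the diagonal (the positive pairs) as targets.
    logits -= logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    return -np.mean(np.diag(log_probs))

rng = np.random.default_rng(0)
emb = rng.normal(size=(8, 16))                   # toy word-in-context embeddings
noisy = emb + 0.01 * rng.normal(size=emb.shape)  # simulates a second augmented pass
loss_aligned = info_nce_loss(emb, noisy)         # positives agree: low loss
loss_random = info_nce_loss(emb, rng.normal(size=emb.shape))  # no agreement: high loss
```

Minimising such a loss pulls the two views of each occurrence together while pushing apart embeddings of other occurrences in the batch; in practice the views would come from the encoder itself (e.g. under different dropout or masking), not from added Gaussian noise as in this toy example.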
26. UNKs Everywhere: Adapting Multilingual Language Models to New Scripts ...
    Source: BASE
27. How Good is Your Tokenizer? On the Monolingual Performance of Multilingual Language Models ...
    Source: BASE
28. Learning Domain-Specialised Representations for Cross-Lingual Biomedical Entity Linking ...
    Source: BASE
29. MirrorWiC: On Eliciting Word-in-Context Representations from Pretrained Language Models ...
    Source: BASE
30. Multilingual and Cross-Lingual Intent Detection from Spoken Data ...
    Source: BASE
31. Semantic Data Set Construction from Human Clustering and Spatial Arrangement ...
    Source: BASE
32. AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples ...
    Source: BASE
33. Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders ...
    Source: BASE
34. Parameter space factorization for zero-shot learning across tasks and languages
    In: Transactions of the Association for Computational Linguistics, 9 (2021)
    Source: BASE
35. AM2iCo: Evaluating Word Meaning in Context across Low-Resource Languages with Adversarial Examples ...
    Source: BASE
36. Fast, Effective, and Self-Supervised: Transforming Masked Language Models into Universal Lexical and Sentence Encoders ...
    Source: BASE
37. LexFit: Lexical Fine-Tuning of Pretrained Language Models ...
    Source: BASE
38. Verb Knowledge Injection for Multilingual Event Processing ...
    Source: BASE
39. A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters ...
    Source: BASE
40. Is supervised syntactic parsing beneficial for language understanding tasks? An empirical investigation
    Source: BASE