1 | Unsupervised Multilingual Sentence Embeddings for Parallel Corpus Mining ... | BASE
2 | Beyond Offline Mapping: Learning Cross-lingual Word Embeddings through Context Anchoring ... | BASE
3 | Image Captioning for Effective Use of Language Models in Knowledge-Based Visual Question Answering ... | BASE
4 | Label Verbalization and Entailment for Effective Zero and Few-Shot Relation Extraction ... | BASE
5 | Spot The Bot : a robust and efficient framework for the evaluation of conversational dialogue systems ... | BASE
6 | DoQA : accessing domain-specific FAQs via conversational QA ... | BASE
7 | A methodology for creating question answering corpora using inverse data annotation ... | BASE
8 | A Call for More Rigor in Unsupervised Cross-lingual Learning ... | BASE
9 | Give your Text Representation Models some Love: the Case for Basque ... | BASE
Abstract: Word embeddings and pre-trained language models make it possible to build rich representations of text and have enabled improvements across most NLP tasks. Unfortunately, they are very expensive to train, and many small companies and research groups tend to use models that have been pre-trained and made available by third parties, rather than building their own. This is suboptimal as, for many languages, the models have been trained on smaller (or lower quality) corpora. In addition, monolingual pre-trained models for non-English languages are not always available. At best, models for those languages are included in multilingual versions, where each language shares the quota of substrings and parameters with the rest of the languages. This is particularly true for smaller languages such as Basque. In this paper we show that a number of monolingual models (FastText word embeddings, FLAIR and BERT language models) trained with larger Basque corpora produce much better results than publicly available versions in downstream ... : Accepted at LREC 2020; 8 pages, 7 tables ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2004.00033 https://dx.doi.org/10.48550/arxiv.2004.00033
10 | Improving Conversational Question Answering Systems after Deployment using Feedback-Weighted Learning ... | BASE
11 | Beyond Offline Mapping: Learning Cross Lingual Word Embeddings through Context Anchoring ... | BASE
12 | Translation Artifacts in Cross-lingual Transfer Learning ... | BASE
13 | Common sense or world knowledge? Investigating adapter-based knowledge injection into pretrained transformers | BASE
14 | Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings | BASE
15 | Probing for Semantic Classes: Diagnosing the Meaning Content of Word Embeddings ... | BASE
16 | Analyzing the Limitations of Cross-lingual Word Embedding Mappings ... | BASE
17 | Uncovering divergent linguistic information in word embeddings with lessons for intrinsic and extrinsic evaluation ... | BASE
18 | SemEval-2017 Task 1: Semantic Textual Similarity Multilingual and Cross-lingual Focused Evaluation | BASE
In: Proceedings of the 11th International Workshop on Semantic Evaluation (SemEval-2017) ; The 11th International Workshop on Semantic Evaluation (SemEval-2017) ; https://hal.archives-ouvertes.fr/hal-01560674 ; The 11th International Workshop on Semantic Evaluation (SemEval-2017), Steven Bethard; Marine Carpuat; Marianna Apidianaki; Saif M. Mohammad; Daniel Cer; David Jurgens, Aug 2017, Vancouver, Canada. pp.1 - 14 (2017)
19 | SemEval-2017 Task 1: Semantic Textual Similarity - Multilingual and Cross-lingual Focused Evaluation ... | BASE
20 | QTLeap WSD/NED corpus | BASE
Agirre, Eneko; Branco, António; Popel, Martin. - : University of the Basque Country, UPV/EHU, 2015. : Faculty of Science, University of Lisbon, FCUL, 2015. : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2015. : Bulgarian Academy of Sciences, IICT-BAS, 2015