21 |
Évaluation des propriétés multilingues d'un embedding contextualisé
|
|
|
|
In: EGC 2022 - Conférence francophone sur l'Extraction et la Gestion des Connaissances ; https://hal.archives-ouvertes.fr/hal-03578480 ; EGC 2022 - Conférence francophone sur l'Extraction et la Gestion des Connaissances, Jan 2022, Blois, France (2022)
|
|
BASE
|
|
Show details
|
|
22 |
Probing Multilingual Cognate Prediction Models
|
|
|
|
In: Findings of the Association for Computational Linguistics: ACL 2022 ; https://hal.inria.fr/hal-03614691 ; Findings of the Association for Computational Linguistics: ACL 2022, May 2022, Dublin, Ireland (2022)
|
|
BASE
|
|
Show details
|
|
23 |
Introducing the HIPE 2022 Shared Task: Named Entity Recognition and Linking in Multilingual Historical Documents
|
|
|
|
In: Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II ; https://hal.archives-ouvertes.fr/hal-03635971 ; Matthias Hagen; Suzan Verberne; Craig Macdonald; Christin Seifert; Krisztian Balog; Kjetil Nørvåg; Vinay Setty. Advances in Information Retrieval. 44th European Conference on IR Research, ECIR 2022, Stavanger, Norway, April 10–14, 2022, Proceedings, Part II, 13186, Springer International Publishing, pp.347-354, 2022, Lecture Notes in Computer Science, 978-3-030-99738-0. ⟨10.1007/978-3-030-99739-7_44⟩ (2022)
|
|
BASE
|
|
Show details
|
|
24 |
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
|
|
|
|
In: Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021) ; https://hal.inria.fr/hal-03527328 ; Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021), Jan 2022, punta cana, Dominican Republic ; https://aclanthology.org/2021.wnut-1.47/ (2022)
|
|
BASE
|
|
Show details
|
|
25 |
Automatic Speech Recognition and Query By Example for Creole Languages Documentation
|
|
|
|
In: Findings of the Association for Computational Linguistics: ACL 2022 ; https://hal.archives-ouvertes.fr/hal-03625303 ; Findings of the Association for Computational Linguistics: ACL 2022, May 2022, Dublin, Ireland (2022)
|
|
BASE
|
|
Show details
|
|
26 |
Cross-lingual few-shot hate speech and offensive language detection using meta learning
|
|
|
|
In: ISSN: 2169-3536 ; EISSN: 2169-3536 ; IEEE Access ; https://hal.archives-ouvertes.fr/hal-03559484 ; IEEE Access, IEEE, 2022, 10, pp.14880-14896. ⟨10.1109/ACCESS.2022.3147588⟩ (2022)
|
|
BASE
|
|
Show details
|
|
27 |
Cross-Situational Learning Towards Robot Grounding
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03628290 ; 2022 (2022)
|
|
BASE
|
|
Show details
|
|
28 |
Cross-Situational Learning Towards Robot Grounding
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03628290 ; 2022 (2022)
|
|
BASE
|
|
Show details
|
|
29 |
PROTECT: A Pipeline for Propaganda Detection and Classification
|
|
|
|
In: CLiC-it 2021- Italian Conference on Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-03417019 ; CLiC-it 2021- Italian Conference on Computational Linguistics, Jan 2022, Milan, Italy (2022)
|
|
BASE
|
|
Show details
|
|
30 |
PROTECT - A Pipeline for Propaganda Detection and Classification
|
|
|
|
In: Eighth Italian Conference on Computational Linguistics (CLIC-it 2021) ; https://hal.archives-ouvertes.fr/hal-03417019 ; Eighth Italian Conference on Computational Linguistics (CLIC-it 2021), Jan 2022, Milan, Italy (2022)
|
|
BASE
|
|
Show details
|
|
31 |
Building infrastructure for annotating medieval, classical and pre-orthographic languages: the Pyrrha ecosystem
|
|
|
|
In: Digital Humanities 2022 (DH2022) ; https://hal.archives-ouvertes.fr/hal-03606756 ; Digital Humanities 2022 (DH2022), Jul 2022, Tokyo, Japan ; https://dh2022.adho.org/ (2022)
|
|
BASE
|
|
Show details
|
|
32 |
Evaluation of Speaker Anonymization on Emotional Speech ; Analyse de l'anonymisation du locuteur sur de la parole émotionnelle
|
|
|
|
In: JEP2022 - Journées d'Études sur la Parole ; https://hal.archives-ouvertes.fr/hal-03636737 ; JEP2022 - Journées d'Études sur la Parole, Jun 2022, Île de Noirmoutier, France (2022)
|
|
BASE
|
|
Show details
|
|
33 |
Language identification, a tool for Corsican and for the evaluation of linguistic resources ; L'identification de langue, un outil au service du corse et de l'évaluation des ressources linguistiques
|
|
|
|
In: Traitement Automatique des Langues ; https://hal.archives-ouvertes.fr/hal-03633290 ; Traitement Automatique des Langues, 2022, Diversité Linguistique, 62 (3), pp.13-37 ; https://www.atala.org/content/diversité-linguistique-linguistic-diversity-natural-language-processing (2022)
|
|
BASE
|
|
Show details
|
|
34 |
Imputing Out-of-Vocabulary Embeddings with LOVE Makes Language Models Robust with Little Cost
|
|
|
|
In: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-03613101 ; ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, May 2022, Dublin, Ireland (2022)
|
|
BASE
|
|
Show details
|
|
35 |
Imputing out-of-vocabulary embeddings with LOVE makes language models robust with little cost
|
|
|
|
In: ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-03613101 ; ACL 2022 - 60th Annual Meeting of the Association for Computational Linguistics, May 2022, Dublin, Ireland (2022)
|
|
Abstract:
International audience ; State-of-the-art NLP systems represent inputs with word embeddings, but these are brittle when faced with Out-of-Vocabulary (OOV) words. To address this issue, we follow the principle of mimick-like models to generate vectors for unseen words, by learning the behavior of pre-trained embeddings using only the surface form of words. We present a simple contrastive learning framework, LOVE, which extends the word representation of an existing pre-trained language model (such as BERT), and makes it robust to OOV with few additional parameters. Extensive evaluations demonstrate that our lightweight model achieves similar or even better performances than prior competitors, both on original datasets and on corrupted variants. Moreover, it can be used in a plug-and-play fashion with FastText and BERT, where it significantly improves their robustness.
|
|
Keyword:
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; Language models; Out-of-vocabulary OOV words; Word embeddings
|
|
URL: https://hal.archives-ouvertes.fr/hal-03613101 https://hal.archives-ouvertes.fr/hal-03613101/document https://hal.archives-ouvertes.fr/hal-03613101/file/Imputing%20OOV%20Embeddings%20with%20LOVE.pdf
|
|
BASE
|
|
Hide details
|
|
36 |
Word Sense Induction with Attentive Context Clustering
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03586559 ; 2022 (2022)
|
|
BASE
|
|
Show details
|
|
37 |
Word Sense Induction with Attentive Context Clustering
|
|
|
|
In: https://hal.inria.fr/hal-03586559 ; 2022 (2022)
|
|
BASE
|
|
Show details
|
|
38 |
Word Sense Induction with Attentive Context Clustering
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03586559 ; 2022 (2022)
|
|
BASE
|
|
Show details
|
|
39 |
Can machines learn to see without visual databases?
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03526569 ; 2022 (2022)
|
|
BASE
|
|
Show details
|
|
40 |
Quality Assurance of Generative Dialog Models in an Evolving Conversational Agent Used for Swedish Language Practice ...
|
|
|
|
BASE
|
|
Show details
|
|
|
|