
Search in the Catalogues and Directories

Page: 1 2
Hits 1 – 20 of 36

1
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
In: Seventh Workshop on Noisy User-generated Text (W-NUT 2021, co-located with EMNLP 2021) ; https://hal.inria.fr/hal-03527328 ; Seventh Workshop on Noisy User-generated Text (W-NUT 2021, co-located with EMNLP 2021), Jan 2022, Punta Cana, Dominican Republic ; https://aclanthology.org/2021.wnut-1.47/ (2022)
BASE
2
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT
In: https://hal.inria.fr/hal-03161685 ; 2021 (2021)
3
Can Multilingual Language Models Transfer to an Unseen Dialect? A Case Study on North African Arabizi
In: https://hal.inria.fr/hal-03161677 ; 2021 (2021)
Abstract: Building natural language processing systems for non-standardized and low-resource languages is a difficult challenge. The recent success of large-scale multilingual pretrained language models provides new modeling tools to tackle this. In this work, we study the ability of multilingual language models to process an unseen dialect. We take user-generated North-African Arabic as our case study, a resource-poor dialectal variety of Arabic with frequent code-mixing with French, written in Arabizi, a non-standardized transliteration of Arabic into Latin script. Focusing on two tasks, part-of-speech tagging and dependency parsing, we show in zero-shot and unsupervised adaptation scenarios that multilingual language models are able to transfer to such an unseen dialect, specifically in two extreme cases: (i) across scripts, using Modern Standard Arabic as a source language, and (ii) from a distantly related language unseen during pretraining, namely Maltese. Our results constitute the first successful transfer experiments on this dialect, thus paving the way for the development of an NLP ecosystem for resource-scarce, non-standardized and highly variable vernacular languages.
Keyword: [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing
URL: https://hal.inria.fr/hal-03161677
4
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT
In: EACL 2021 - The 16th Conference of the European Chapter of the Association for Computational Linguistics ; https://hal.inria.fr/hal-03239087 ; EACL 2021 - The 16th Conference of the European Chapter of the Association for Computational Linguistics, Apr 2021, Kyiv / Virtual, Ukraine ; https://2021.eacl.org/ (2021)
5
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
In: NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies ; https://hal.inria.fr/hal-03251105 ; NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Jun 2021, Mexico City, Mexico (2021)
6
Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering
In: https://hal.inria.fr/hal-03109187 ; 2021 (2021)
7
Universal Dependencies 2.9
Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. Universal Dependencies Consortium, 2021
8
Universal Dependencies 2.8.1
Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. Universal Dependencies Consortium, 2021
9
Universal Dependencies 2.8
Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. Universal Dependencies Consortium, 2021
10
Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios? ...
11
First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT ...
12
Building a User-Generated Content North-African Arabizi Treebank: Tackling Hell
In: ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics ; https://hal.inria.fr/hal-02889804 ; ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020, Seattle / Virtual, United States. ⟨10.18653/v1/2020.acl-main.107⟩ (2020)
13
CamemBERT: a Tasty French Language Model
In: ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics ; https://hal.inria.fr/hal-02889805 ; ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020, Seattle / Virtual, United States. ⟨10.18653/v1/2020.acl-main.645⟩ (2020)
14
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
In: https://hal.inria.fr/hal-03109106 ; 2020 (2020)
15
Universal Dependencies 2.7
Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. Universal Dependencies Consortium, 2020
16
Universal Dependencies 2.6
Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. Universal Dependencies Consortium, 2020
17
Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering ...
18
When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models ...
19
CamemBERT: a Tasty French Language Model
In: https://hal.inria.fr/hal-02445946 ; 2019 (2019)
20
Enhancing BERT for Lexical Normalization
In: The 5th Workshop on Noisy User-generated Text (W-NUT) ; https://hal.inria.fr/hal-02294316 ; The 5th Workshop on Noisy User-generated Text (W-NUT), Nov 2019, Hong Kong, China (2019)


Hits by source type:
Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 36
© 2013 - 2024 Linguistik