Hits 1 – 20 of 29

1	Towards Unsupervised Content Disentanglement in Sentence Representations via Syntactic Roles
	Felhi, Ghazi; Le Roux, Joseph; Seddah, Djamé
	In: CtrlGen: Controllable Generative Modeling in Language and Vision ; https://hal.inria.fr/hal-03540084 ; CtrlGen: Controllable Generative Modeling in Language and Vision, Jan 2022, virtual, France (2022)
	BASE

2	Can Character-based Language Models Improve Downstream Task Performance in Low-Resource and Noisy Language Scenarios?
	Riabi, Arij; Sagot, Benoît; Seddah, Djamé
	In: Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021) ; https://hal.inria.fr/hal-03527328 ; Seventh Workshop on Noisy User-generated Text (W-NUT 2021, colocated with EMNLP 2021), Jan 2022, Punta Cana, Dominican Republic ; https://aclanthology.org/2021.wnut-1.47/ (2022)
	BASE

3	First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT
	Muller, Benjamin; Elazar, Yanai; Sagot, Benoît...
	In: https://hal.inria.fr/hal-03161685 ; 2021 (2021)
	BASE

4	Can Multilingual Language Models Transfer to an Unseen Dialect? A Case Study on North African Arabizi
	Muller, Benjamin; Sagot, Benoît; Seddah, Djamé
	In: https://hal.inria.fr/hal-03161677 ; 2021 (2021)
	BASE

5	First Align, then Predict: Understanding the Cross-Lingual Ability of Multilingual BERT
	Muller, Benjamin; Elazar, Yanai; Sagot, Benoît...
	In: EACL 2021 - The 16th Conference of the European Chapter of the Association for Computational Linguistics ; https://hal.inria.fr/hal-03239087 ; EACL 2021 - The 16th Conference of the European Chapter of the Association for Computational Linguistics, Apr 2021, Kyiv / Virtual, Ukraine ; https://2021.eacl.org/ (2021)
	BASE

6	When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
	Muller, Benjamin; Anastasopoulos, Antonios; Sagot, Benoît; Seddah, Djamé
	In: NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies ; https://hal.inria.fr/hal-03251105 ; NAACL-HLT 2021 - 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Jun 2021, Mexico City, Mexico (2021)
	Abstract: Transfer learning based on pretraining language models on a large amount of raw data has become a new norm to reach state-of-the-art performance in NLP. Still, it remains unclear how this approach should be applied for unseen languages that are not covered by any available large-scale multilingual language model and for which only a small amount of raw data is generally available. In this work, by comparing multilingual and monolingual models, we show that such models behave in multiple ways on unseen languages. Some languages greatly benefit from transfer learning and behave similarly to closely related high-resource languages, whereas others apparently do not. Focusing on the latter, we show that this failure to transfer is largely related to the impact of the script used to write such languages. We show that transliterating those languages significantly improves the potential of large-scale multilingual language models on downstream tasks. This result provides a promising direction towards making these massively multilingual models useful for a new set of unseen languages.
	Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
	URL: https://hal.inria.fr/hal-03251105/file/NAACL21_Muller_et_al.pdf https://hal.inria.fr/hal-03251105/document https://hal.inria.fr/hal-03251105
	BASE

7	PAGnol: An Extra-Large French Generative Model
	Launay, Julien; Tommasone, Giuseppe Luca; Pannier, Baptiste...
	In: https://hal.inria.fr/hal-03540159 ; [Research Report] LightON. 2021 (2021)
	BASE

8	Synthetic Data Augmentation for Zero-Shot Cross-Lingual Question Answering
	Riabi, Arij; Scialom, Thomas; Keraron, Rachel...
	In: https://hal.inria.fr/hal-03109187 ; 2021 (2021)
	BASE

9	Noisy UGC Translation at the Character Level: Revisiting Open-Vocabulary Capabilities and Robustness of Char-Based Models
	Rosales Núñez, José Carlos; Wisniewski, Guillaume; Seddah, Djamé
	In: W-NUT 2021 - 7th Workshop on Noisy User-generated Text (colocated with EMNLP 2021) ; https://hal.inria.fr/hal-03540174 ; W-NUT 2021 - 7th Workshop on Noisy User-generated Text (colocated with EMNLP 2021), Association for Computational Linguistics, Nov 2021, Punta Cana, Dominican Republic (2021)
	BASE

10	Understanding the Impact of UGC Specificities on Translation Quality
	Rosales Núñez, José Carlos; Seddah, Djamé; Wisniewski, Guillaume
	In: W-NUT 2021 - Seventh Workshop on Noisy User-generated Text (colocated with EMNLP 2021) ; https://hal.inria.fr/hal-03540175 ; W-NUT 2021 - Seventh Workshop on Noisy User-generated Text (colocated with EMNLP 2021), Association for Computational Linguistics, Nov 2021, Punta Cana, Dominican Republic (2021)
	BASE

11	Challenging the Semi-Supervised VAE Framework for Text Classification
	Felhi, Ghazi; Le Roux, Joseph; Seddah, Djamé
	In: Second Workshop on Insights from Negative Results in NLP (colocated with EMNLP) ; https://hal.inria.fr/hal-03540081 ; Second Workshop on Insights from Negative Results in NLP (colocated with EMNLP), Nov 2021, Punta Cana, Dominican Republic ; https://insights-workshop.github.io/2021/ (2021)
	BASE

12	Building a User-Generated Content North-African Arabizi Treebank: Tackling Hell
	Seddah, Djamé; Essaidi, Farah; Fethi, Amal...
	In: ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics ; https://hal.inria.fr/hal-02889804 ; ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020, Seattle / Virtual, United States. ⟨10.18653/v1/2020.acl-main.107⟩ (2020)
	BASE

13	CamemBERT: a Tasty French Language Model
	Martin, Louis; Muller, Benjamin; Ortiz Suárez, Pedro Javier...
	In: ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics ; https://hal.inria.fr/hal-02889805 ; ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020, Seattle / Virtual, United States. ⟨10.18653/v1/2020.acl-main.645⟩ (2020)
	BASE

14	Simple, Interpretable and Stable Method for Detecting Words with Usage Change across Corpora
	Gonen, Hila; Jawahar, Ganesh; Seddah, Djamé...
	In: ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics ; https://hal.inria.fr/hal-03161637 ; ACL 2020 - 58th Annual Meeting of the Association for Computational Linguistics, Jul 2020, Seattle / Virtual, United States. pp.538-555, ⟨10.18653/v1/2020.acl-main.51⟩ (2020)
	BASE

15	When Being Unseen from mBERT is just the Beginning: Handling New Languages With Multilingual Language Models
	Muller, Benjamin; Anastasopoulos, Antonios; Sagot, Benoît...
	In: https://hal.inria.fr/hal-03109106 ; 2020 (2020)
	BASE

16	Unsupervised Learning for Handling Code-Mixed Data: A Case Study on POS Tagging of North-African Arabizi Dialect
	Srivastava, Abhishek; Muller, Benjamin; Seddah, Djamé
	In: EurNLP - First annual EurNLP ; https://hal.archives-ouvertes.fr/hal-02270527 ; EurNLP - First annual EurNLP, Oct 2019, London, United Kingdom (2019)
	BASE

17	CamemBERT: a Tasty French Language Model
	Martin, Louis; Muller, Benjamin; Ortiz Suárez, Pedro Javier...
	In: https://hal.inria.fr/hal-02445946 ; 2019 (2019)
	BASE

18	Enhancing BERT for Lexical Normalization
	Muller, Benjamin; Sagot, Benoît; Seddah, Djamé
	In: The 5th Workshop on Noisy User-generated Text (W-NUT) ; https://hal.inria.fr/hal-02294316 ; The 5th Workshop on Noisy User-generated Text (W-NUT), Nov 2019, Hong Kong, China (2019)
	BASE

19	What does BERT learn about the structure of language?
	Jawahar, Ganesh; Sagot, Benoît; Seddah, Djamé
	In: ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics ; https://hal.inria.fr/hal-02131630 ; ACL 2019 - 57th Annual Meeting of the Association for Computational Linguistics, Jul 2019, Florence, Italy (2019)
	BASE

20	Contextualized Diachronic Word Representations
	Jawahar, Ganesh; Seddah, Djamé
	In: 1st International Workshop on Computational Approaches to Historical Language Change 2019 (colocated with ACL 2019) ; https://hal.archives-ouvertes.fr/hal-02194763 ; 1st International Workshop on Computational Approaches to Historical Language Change 2019 (colocated with ACL 2019), Aug 2019, Florence, Italy (2019)
	BASE
