2 |
On the Universality of Deep Contextual Language Models ...
Abstract:
Deep Contextual Language Models (LMs) like ELMo, BERT, and their successors dominate the landscape of Natural Language Processing due to their ability to scale across multiple tasks rapidly by pre-training a single model, followed by task-specific fine-tuning. Furthermore, multilingual versions of such models like XLM-R and mBERT have given promising results in zero-shot cross-lingual transfer, potentially enabling NLP applications in many under-served and under-resourced languages. Due to this initial success, pre-trained models are being used as 'Universal Language Models' as the starting point across diverse tasks, domains, and languages. This work explores the notion of 'Universality' by identifying seven dimensions across which a universal model should be able to scale, that is, perform equally well or reasonably well, to be useful across diverse settings. We outline the current theoretical and empirical results that support model performance across these dimensions, along with extensions that may help ... : 9 pages ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2109.07140 https://dx.doi.org/10.48550/arxiv.2109.07140
BASE
4 |
A New Dataset for Natural Language Inference from Code-mixed Conversations ...
BASE
5 |
Phone Merging for Code-switched Speech Recognition
In: Third Workshop on Computational Approaches to Linguistic Code-switching, co-located with ACL 2018, Jul 2018, Melbourne, Australia (2018) ; https://hal.inria.fr/hal-01800466
BASE