3. AUTOLEX: An Automatic Framework for Linguistic Exploration ...
BASE
4. MCoNaLa: A Benchmark for Code Generation from Multiple Natural Languages ...
Abstract:
While there has been a recent burgeoning of applications at the intersection of natural and programming languages, such as code generation and code summarization, these applications are usually English-centric. This creates a barrier for program developers who are not proficient in English. To mitigate this gap in technology development across languages, we propose a multilingual dataset, MCoNaLa, to benchmark code generation from natural language commands extending beyond English. Modeled off of the methodology from the English Code/Natural Language Challenge (CoNaLa) dataset, we annotated a total of 896 NL-code pairs in three languages: Spanish, Japanese, and Russian. We present a quantitative evaluation of performance on the MCoNaLa dataset by testing with state-of-the-art code generation systems. While the difficulties vary across these three languages, all systems lag significantly behind their English counterparts, revealing the challenges in adapting code generation to new languages. ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2203.08388 https://arxiv.org/abs/2203.08388
BASE
5. A Systematic Evaluation of Large Language Models of Code ...
BASE
6. Expanding Pretrained Models to Thousands More Languages via Lexicon-based Adaptation ...
BASE
7. Attention-Passing Models for Robust and Data-Efficient End-to-End Speech Translation
In: Transactions of the Association for Computational Linguistics, 7, 313–325 ; ISSN: 2307-387X (2022)
BASE
9. MasakhaNER: Named entity recognition for African languages
In: EISSN: 2307-387X ; Transactions of the Association for Computational Linguistics ; https://hal.inria.fr/hal-03350962 ; Transactions of the Association for Computational Linguistics, The MIT Press, 2021, ⟨10.1162/tacl⟩ (2021)
BASE
10. Phoneme Recognition through Fine Tuning of Phonetic Representations: a Case Study on Luhya Language Varieties ...
BASE
11. Few-shot Language Coordination by Modeling Theory of Mind ...
BASE
12. Systematic Inequalities in Language Technology Performance across the World's Languages ...
BASE
13. Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models ...
BASE
15. MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning ...
BASE
16. XTREME-R: Towards More Challenging and Nuanced Multilingual Evaluation ...
BASE
17. When Does Translation Require Context? A Data-driven, Multilingual Exploration ...
BASE
19. Efficient Test Time Adapter Ensembling for Low-resource Language Varieties ...
BASE
20. Distributionally Robust Multilingual Machine Translation ...
BASE