22 |
Towards Minimal Supervision BERT-based Grammar Error Correction ...
23 |
SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection ...
24 |
It's not a Non-Issue: Negation as a Source of Error in Machine Translation ...
25 |
Automatic Extraction of Rules Governing Morphological Agreement ...
26 |
A Summary of the First Workshop on Language Technology for Language Documentation and Revitalization ...
27 |
It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information ...
28 |
Universal Phone Recognition with a Multilingual Allophone System ...
29 |
It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
30 |
X-FACTR: Multilingual Factual Knowledge Retrieval from Pretrained Language Models ...
31 |
AlloVera: a multilingual allophone database
In: LREC 2020: 12th Language Resources and Evaluation Conference, European Language Resources Association, May 2020, Marseille, France ; https://halshs.archives-ouvertes.fr/halshs-02527046 ; https://lrec2020.lrec-conf.org/ (2020)
32 |
Generalized Data Augmentation for Low-Resource Translation ...
33 |
Pushing the Limits of Low-Resource Morphological Inflection ...
37 |
A small Griko-Italian speech translation corpus
In: 6th International Workshop on Spoken Language Technologies for Under-resourced Languages (SLTU'18), Aug 2018, New Delhi, India ; https://hal.archives-ouvertes.fr/hal-01962528 (2018)
Abstract:
This paper presents an extension to a very low-resource parallel corpus collected in an endangered language, Griko, making it useful for computational research. The corpus consists of 330 utterances (about 20 minutes of speech) which have been transcribed and translated into Italian, with annotations for word-level speech-to-transcription and speech-to-translation alignments. The corpus also includes morphosyntactic tags and word-level glosses. Applying an automatic unit discovery method, pseudo-phones were also generated. We detail how the corpus was collected, cleaned and processed, and we illustrate its use on zero-resource tasks by presenting some baseline results for the task of speech-to-translation alignment and unsupervised word discovery. The dataset is available online, aiming to encourage replicability and diversity in computational language documentation experiments.
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
URL: https://hal.archives-ouvertes.fr/hal-01962528 https://hal.archives-ouvertes.fr/hal-01962528/document https://hal.archives-ouvertes.fr/hal-01962528/file/sltu2018.pdf
38 |
A case study on using speech-to-translation alignments for language documentation ...