1. Morphological Processing of Low-Resource Languages: Where We Are and What's Next ...
2. Pre-Trained Multilingual Sequence-to-Sequence Models: A Hope for Low-Resource Language Translation? ...
5. The SIGMORPHON 2019 Shared Task: Morphological Analysis in Context and Cross-Lingual Transfer for Inflection ...
6. Modeling Color Terminology Across Thousands of Languages ...
7. Marrying Universal Dependencies and Universal Morphology ...
Abstract:
The Universal Dependencies (UD) and Universal Morphology (UniMorph) projects each present schemata for annotating the morphosyntactic details of language. Each project also provides corpora of annotated text in many languages - UD at the token level and UniMorph at the type level. As each corpus is built by different annotators, language-specific decisions hinder the goal of universal schemata. With compatibility of tags, each project's annotations could be used to validate the other's. Additionally, the availability of both type- and token-level resources would be a boon to tasks such as parsing and homograph disambiguation. To ease this interoperability, we present a deterministic mapping from Universal Dependencies v2 features into the UniMorph schema. We validate our approach by lookup in the UniMorph corpora and find a macro-average of 64.13% recall. We also note incompatibilities due to paucity of data on either side. Finally, we present a critical evaluation of the foundations, strengths, and ... : UDW18 ...
Keyword:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.1810.06743 https://arxiv.org/abs/1810.06743