
Search in the Catalogues and Directories

Hits 1 – 9 of 9

1
IndoNLI: A Natural Language Inference Dataset for Indonesian ...
BASE
2
IndoNLI: A Natural Language Inference Dataset for Indonesian ...
BASE
3
What Ingredients Make for an Effective Crowdsourcing Protocol for Difficult NLU Data Collection Tasks? ...
BASE
4
Comparing Test Sets with Item Response Theory ...
BASE
5
VisualSem: A High-quality Knowledge Graph for Vision and Language ...
BASE
6
On understanding character-level models for representing morphology ...
Vania, Clara. - : The University of Edinburgh, 2020
BASE
7
On understanding character-level models for representing morphology
Vania, Clara. - : The University of Edinburgh, 2020
Abstract: Morphology is the study of how words are composed of smaller units of meaning (morphemes). It allows humans to create, memorize, and understand words in their language. To process and understand human languages, we expect our computational models to also learn morphology. Recent advances in neural network models provide us with models that compose word representations from smaller units like word segments, character n-grams, or characters. These so-called subword unit models do not explicitly model morphology yet they achieve impressive performance across many multilingual NLP tasks, especially on languages with complex morphological processes. This thesis aims to shed light on the following questions: (1) What do subword unit models learn about morphology? (2) Do we still need prior knowledge about morphology? (3) How do subword unit models interact with morphological typology? First, we systematically compare various subword unit models and study their performance across language typologies. We show that models based on characters are particularly effective because they learn orthographic regularities which are consistent with morphology. To understand which aspects of morphology are not captured by these models, we compare them with an oracle with access to explicit morphological analysis. We show that in the case of dependency parsing, character-level models are still poor in representing words with ambiguous analyses. We then demonstrate how explicit modeling of morphology is helpful in such cases. Finally, we study how character-level models perform in low resource, cross-lingual NLP scenarios, whether they can facilitate cross-linguistic transfer of morphology across related languages. 
While we show that cross-lingual character-level models can improve low-resource NLP performance, our analysis suggests that it is mostly because of the structural similarities between languages and we do not yet find any strong evidence of cross-linguistic transfer of morphology. This thesis presents a careful, in-depth study and analyses of character-level models and their relation to morphology, providing insights and future research directions on building morphologically-aware computational NLP models.
Keyword: character-level models; dependency parsing; morphemes; morphology; natural language processing; NLP
URL: https://doi.org/10.7488/era/49
https://hdl.handle.net/1842/36742
BASE
8
LINSPECTOR: Multilingual Probing Tasks for Word Representations
In: Computational Linguistics, Vol 46, Iss 2, Pp 335-385 (2020)
BASE
9
CoNLL 2017 Shared Task System Outputs
Zeman, Daniel; Potthast, Martin; Straka, Milan. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2017
BASE
