2 |
Predicting Declension Class from Form and Meaning
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
Abstract:
The noun lexica of many natural languages are divided into several declension classes with characteristic morphological properties. Class membership is far from deterministic, but the phonological form of a noun and/or its meaning can often provide imperfect clues. Here, we investigate the strength of those clues. More specifically, we operationalize this by measuring how much information, in bits, we can glean about declension class from knowing the form and/or meaning of nouns. We know that form and meaning are often also indicative of grammatical gender (which, as we quantitatively verify, can itself share information with declension class), so we also control for gender. We find for two Indo-European languages (Czech and German) that form and meaning respectively share significant amounts of information with class (and contribute additional information above and beyond gender). The three-way interaction between class, form, and meaning (given gender) is also significant. Our study is important for two reasons: First, we introduce a new method that provides additional quantitative support for a classic linguistic finding that form and meaning are relevant for the classification of nouns into declensions. Second, we show not only that individual declension classes vary in the strength of their clues within a language, but also that these variations themselves vary across languages.
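As a rough illustration of what "information, in bits" means in the abstract above, the sketch below computes a plug-in estimate of the mutual information between two discrete variables from paired observations. This is not the estimator used in the paper, which the abstract does not describe; the function name `mutual_information_bits` and the (gender, class) labels are hypothetical, made-up illustration data, and form and meaning would need a different treatment because they are not simple discrete labels.

```python
import math
from collections import Counter


def mutual_information_bits(pairs):
    """Plug-in estimate of I(X; Y) in bits from a list of (x, y) observations."""
    n = len(pairs)
    joint = Counter(pairs)                 # counts of each (x, y) pair
    count_x = Counter(x for x, _ in pairs)  # marginal counts of x
    count_y = Counter(y for _, y in pairs)  # marginal counts of y
    mi = 0.0
    for (x, y), c in joint.items():
        p_xy = c / n
        # p(x, y) / (p(x) * p(y)) simplifies to c * n / (count_x * count_y)
        mi += p_xy * math.log2(c * n / (count_x[x] * count_y[y]))
    return mi


# Hypothetical toy observations of (gender, declension class); not real data.
toy_pairs = [
    ("fem", "class_1"), ("fem", "class_1"), ("fem", "class_2"),
    ("masc", "class_3"), ("masc", "class_3"), ("neut", "class_2"),
]
print(f"I(gender; class) = {mutual_information_bits(toy_pairs):.3f} bits")
```

A value of 0 bits means the two variables are independent in the sample; larger values mean that knowing one reduces uncertainty about the other. Plug-in estimates of this kind are biased upward on small samples, so they are only suitable for illustration.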
URL: https://doi.org/10.3929/ethz-b-000462306 https://hdl.handle.net/20.500.11850/462306
BASE
3 |
The Paradigm Discovery Problem
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
BASE
4 |
A Tale of a Probe and a Parser
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
BASE
5 |
A Corpus for Large-Scale Phonetic Typology
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
BASE
6 |
Information-Theoretic Probing for Linguistic Structure
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
BASE
7 |
It’s Easier to Translate out of English than into it: Measuring Neural Translation Difficulty by Cross-Mutual Information
In: Proceedings of the 58th Annual Meeting of the Association for Computational Linguistics (2020)
BASE
8 |
ASSET: A dataset for tuning and evaluation of sentence simplification models with multiple rewriting transformations
BASE
9 |
Non-linear instance-based cross-lingual mapping for non-isomorphic embedding spaces
BASE
10 |
Classification-based self-learning for weakly supervised bilingual lexicon induction
BASE
11 |
On the limitations of cross-lingual encoders as exposed by reference-free machine translation evaluation
BASE