1 |
Verb-argument lability and its correlations with other typological parameters. A quantitative corpus-based study ...
BASE
2 |
Frequency, Informativity and Word Length: Insights from Typologically Diverse Corpora
In: Entropy (Basel) (2022)
Abstract:
Zipf’s law of abbreviation, which posits a negative correlation between word frequency and length, is one of the most famous and robust cross-linguistic generalizations. At the same time, it has been shown that contextual informativity (average surprisal given previous context) is more strongly correlated with word length, although this tendency is not observed consistently, depending on several methodological choices. The present study examines a more diverse sample of languages than the previous studies (Arabic, Finnish, Hungarian, Indonesian, Russian, Spanish and Turkish). I use large web-based corpora from the Leipzig Corpora Collection to estimate word lengths in UTF-8 characters and in phonemes (for some of the languages), as well as word frequency, informativity given previous word and informativity given next word, applying different methods of bigrams processing. The results show different correlations between word length and the corpus-based measure for different languages. I argue that these differences can be explained by the properties of noun phrases in a language, most importantly, by the order of heads and modifiers and their relative morphological complexity, as well as by orthographic conventions.
Keyword:
Article
URL: https://doi.org/10.3390/e24020280
URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8870940/
3 |
Cross-linguistic differential and optional marking database ...
4 |
Cross-linguistic differential and optional marking database ...
5 |
Loose and tight languages: A typology based on associations between constructions and lexemes ...
6 |
Loose and tight languages: A typology based on associations between constructions and lexemes ...
7 |
Cross-Linguistic Trade-Offs and Causal Relationships Between Cues to Grammatical Subject and Object, and the Problem of Efficiency-Related Explanations
In: Front Psychol (2021)
8 |
Database of Annotated Core Arguments: English, Lao and Russian ...
9 |
Database of Annotated Core Arguments: English, Lao and Russian ...
14 |
Linguistic Frankenstein, or How to test universal constraints without real languages ...
15 |
Explanation in typology: Diachronic sources, functional motivations and the nature of the evidence ...
16 |
Linguistic Frankenstein, or How to test universal constraints without real languages ...
17 |
Explanation in typology: Diachronic sources, functional motivations and the nature of the evidence ...
18 |
Anybody (at) home? Communicative efficiency knocking on the Construction Grammar door (draft) ...
19 |
Anybody (at) home? Communicative efficiency knocking on the Construction Grammar door (draft) ...
20 |
Explanation in typology: Diachronic sources, functional motivations and the nature of the evidence
In: Language Science Press (2018)