Estimating the Entropy of Linguistic Distributions ...
Abstract: Shannon entropy is often a quantity of interest to linguists studying the communicative capacity of human language. However, entropy must typically be estimated from observed data because researchers do not have access to the underlying probability distribution that gives rise to these data. While entropy estimation is a well-studied problem in other fields, there is not yet a comprehensive exploration of the efficacy of entropy estimators for use with linguistic data. In this work, we fill this void, studying the empirical effectiveness of different entropy estimators for linguistic distributions. In a replication of two recent information-theoretic linguistic studies, we find evidence that the reported effect size is over-estimated due to over-reliance on poor entropy estimators. Finally, we end our paper with concrete recommendations for entropy estimation depending on distribution type and data availability. ... : 21 pages (5 pages main text). 4 figures. Accepted to ACL 2022 ...
Keyword: 94A17 Primary 62B10 Secondary; Computation and Language cs.CL; FOS Computer and information sciences; I.2.7; E.4
URL: https://arxiv.org/abs/2204.01469
https://dx.doi.org/10.48550/arxiv.2204.01469
BASE
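
Illustration (not taken from the record): the abstract does not name the estimators that were compared, so the sketch below only shows the general problem it describes, using two standard textbook estimators chosen here as an assumption. The plug-in (maximum-likelihood) estimator computes entropy from relative frequencies and tends to underestimate it on small samples; the Miller-Madow correction adds (K - 1) / (2N), where K is the number of observed types and N the sample size. The function names and the toy token list are hypothetical.

```python
import math
from collections import Counter


def plugin_entropy(samples):
    """Plug-in (maximum-likelihood) entropy estimate, in nats."""
    if not samples:
        raise ValueError("need at least one sample")
    counts = Counter(samples)
    n = sum(counts.values())
    # Entropy of the empirical distribution: -sum p * log p
    return -sum((c / n) * math.log(c / n) for c in counts.values())


def miller_madow_entropy(samples):
    """Plug-in estimate plus the Miller-Madow bias correction (K - 1) / (2N)."""
    counts = Counter(samples)
    n = sum(counts.values())
    k = len(counts)  # number of distinct types actually observed
    return plugin_entropy(samples) + (k - 1) / (2 * n)


# Hypothetical toy sample: 8 tokens, 6 distinct types.
tokens = "the cat sat on the mat the end".split()
h_plugin = plugin_entropy(tokens)
h_mm = miller_madow_entropy(tokens)
print(f"plug-in: {h_plugin:.4f} nats, Miller-Madow: {h_mm:.4f} nats")
```

On a sample this small the two estimates differ noticeably, which is the kind of estimator sensitivity the abstract refers to; the gap shrinks as N grows relative to K.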