41 |
Diffusion of Lexical Change in Social Media ...
|
|
|
|
Abstract:
Computer-mediated communication is driving fundamental changes in the nature of written language. We investigate these changes by statistical analysis of a dataset comprising 107 million Twitter messages (authored by 2.7 million unique user accounts). Using a latent vector autoregressive model to aggregate across thousands of words, we identify high-level patterns in diffusion of linguistic change over the United States. Our model is robust to unpredictable changes in Twitter's sampling rate, and provides a probabilistic characterization of the relationship of macro-scale linguistic influence to a set of demographic and geographic predictors. The results of this analysis offer support for prior arguments that focus on geographical proximity and population size. However, demographic similarity -- especially with regard to race -- plays an even more central role, as cities with similar racial demographics are far more likely to share linguistic influence. Rather than moving towards a single unified "netspeak" ...
Preprint of PLOS ONE paper from November 2014; PLoS ONE 9(11) e113114 ...
|
|
Keywords:
Computation and Language (cs.CL); FOS Computer and information sciences; FOS Physical sciences; Physics and Society (physics.soc-ph); Social and Information Networks (cs.SI)
|
|
URL: https://arxiv.org/abs/1210.5268 https://dx.doi.org/10.48550/arxiv.1210.5268
|
|
BASE
|
|
|
|
43 |
Discovering Sociolinguistic Associations with Structured Sparsity ...
|
|
|
|
BASE
|
|
|
|
44 |
A Latent Variable Model for Geographic Lexical Variation ...
|
|
|
|
BASE
|
|
|
|
45 |
A Latent Variable Model for Geographic Lexical Variation ...
|
|
|
|
BASE
|
|
|
|
48 |
Part-of-Speech Tagging for Twitter: Annotation, Features, and Experiments
|
|
|
|
In: DTIC (2010)
|
|
BASE
|
|
|
|
49 |
Adding More Languages Improves Unsupervised Multilingual Part-of-Speech Tagging: A Bayesian Non-Parametric Approach
|
|
|
|
In: MIT web domain (2009)
|
|
BASE
|
|
|
|
50 |
Multilingual Part-of-Speech Tagging: Two Unsupervised Approaches
|
|
|
|
In: JAIR (2009)
|
|
BASE
|
|
|
|
51 |
Gesture in automatic discourse processing; Structured models of gesture for discourse processing
|
|
|
|
BASE
|
|
|
|