1 |
Classifying Bias in Large Multilingual Corpora via Crowdsourcing and Topic Modeling
|
|
Caljean, Brianna; Calvert, Katherine; Chang, Ashley; Frank, Elliot; Garay Jáuregui, Rosana; Palo, Geoffrey; Rinker, Ryan; Weakly, Gareth; Wolfrey, Nicolette; Zhang, William. - 2018
|
|
Abstract:
Our project extends previous algorithmic approaches to finding bias in large text corpora. We used multilingual topic modeling to examine language-specific bias in the English, Spanish, and Russian versions of Wikipedia. In particular, we placed Spanish articles discussing the Cold War on a Russian-English viewpoint spectrum based on similarity in topic distribution. We then crowdsourced human annotations of Spanish Wikipedia articles for comparison to the topic model. Our hypothesis was that human annotators and topic modeling algorithms would provide correlated results for bias. However, that was not the case. Our annotators indicated that humans were more perceptive of sentiment in article text than topic distribution, which suggests that our classifier provides a different perspective on a text’s bias.
|
|
Keyword:
Gemstone Team BIASES
|
|
URL: https://doi.org/10.13016/M2R49GC7C http://hdl.handle.net/1903/20668
|
|
BASE
|
|
2 |
Correcting Errors in Digital Lexicographic Resources Using a Dictionary Manipulation Language ...
|
|
|
|
BASE
|
|
3 |
A random forest system combination approach for error detection in digital dictionaries ...
|
|
|
|
BASE
|
|
4 |
Detecting Structural Irregularity in Electronic Dictionaries Using Language Modeling ...
|
|
|
|
BASE
|
|
5 |
A random forest system combination approach for error detection in digital dictionaries
|
|
|
|
BASE
|
|
6 |
Citation Handling: Processing Citation Texts in Scientific Documents
|
|
|
|
BASE
|
|
9 |
Correcting Errors in Digital Lexicographic Resources Using a Dictionary Manipulation Language ...
|
|
|
|
BASE
|
|
10 |
Detecting Structural Irregularity in Electronic Dictionaries Using Language Modeling ...
|
|
|
|
BASE
|
|
11 |
Correcting Errors in Digital Lexicographic Resources Using a Dictionary Manipulation Language
|
|
|
|
BASE
|
|
12 |
Detecting Structural Irregularity in Electronic Dictionaries Using Language Modeling
|
|
|
|
BASE
|
|
13 |
Citation Handling for Improved Summarization of Scientific Documents
|
|
|
|
BASE
|
|
14 |
Error Correction for Arabic Dictionary Lookup
|
|
|
|
In: Proceedings of the Seventh International Conference on Language Resources and Evaluation (LREC 2010), Valletta, 17-23 May 2010 (2010), 263-268
|
|
IDS OBELEX meta
|
|
15 |
Multiple Alternative Sentence Compressions as a Tool for Automatic Summarization Tasks
|
|
|
|
BASE
|
|
16 |
Headline Generation for Written and Broadcast News
|
|
|
|
In: DTIC (2005)
|
|
BASE
|
|
17 |
Hedge Trimmer: A Parse-and-Trim Approach to Headline Generation
|
|
|
|
In: DTIC (2003)
|
|
BASE