2 | The CLASSLA-StanfordNLP model for lemmatisation of standard Slovenian 1.4 | BASE
4 | The Twitter user dataset for discriminating between Bosnian, Croatian, Montenegrin and Serbian Twitter-HBS 1.0 | BASE
8 | The news dataset for discriminating between Bosnian, Croatian and Serbian SETimes.HBS 1.0 | BASE
9 | The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Slovenian 1.3 | BASE
10 | The GINCO Training Dataset for Web Genre Identification of Documents Out in the Wild ... | BASE
13 | Retweet communities reveal the main sources of hate speech | In: PLoS One (2022)
Abstract:
We address a challenging problem of identifying main sources of hate speech on Twitter. On one hand, we carefully annotate a large set of tweets for hate speech, and deploy advanced deep learning to produce high quality hate speech classification models. On the other hand, we create retweet networks, detect communities and monitor their evolution through time. This combined approach is applied to three years of Slovenian Twitter data. We report a number of interesting results. Hate speech is dominated by offensive tweets, related to political and ideological issues. The share of unacceptable tweets is moderately increasing with time, from the initial 20% to 30% by the end of 2020. Unacceptable tweets are retweeted significantly more often than acceptable tweets. About 60% of unacceptable tweets are produced by a single right-wing community of only moderate size. Institutional Twitter accounts and media accounts post significantly less unacceptable tweets than individual accounts. In fact, the main sources of unacceptable tweets are anonymous accounts, and accounts that were suspended or closed during the years 2018–2020.
Keyword:
Research Article
URL: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8929563/ https://doi.org/10.1371/journal.pone.0265602
BASE
14 | The ParlaMint corpora of parliamentary proceedings | In: Lang Resour Eval (2022) | BASE
18 | Choice of plausible alternatives dataset in Croatian COPA-HR | BASE
19 | Croatian corpus of non-professional written language by typical speakers and speakers with language disorders RAPUT 1.0 | BASE
20 | The Orange workflow for observing collocation trends ColTrend 1.0 | BASE