1 |
Cross-lingual few-shot hate speech and offensive language detection using meta learning
In: ISSN: 2169-3536 ; EISSN: 2169-3536 ; IEEE Access ; https://hal.archives-ouvertes.fr/hal-03559484 ; IEEE Access, IEEE, 2022, 10, pp.14880-14896. ⟨10.1109/ACCESS.2022.3147588⟩ (2022)
|
|
Abstract:
Automatic detection of abusive online content such as hate speech, offensive language, threats, etc. has become prevalent in social media, with multiple efforts dedicated to detecting this phenomenon in English. However, detecting hatred and abuse in low-resource languages is a non-trivial challenge. The lack of sufficient labeled data in low-resource languages and inconsistent generalization ability of transformer-based multilingual pre-trained language models for typologically diverse languages make these models inefficient in some cases. We propose a meta learning-based approach to study the problem of few-shot hate speech and offensive language detection in low-resource languages that will allow hateful or offensive content to be predicted by only observing a few labeled data items in a specific target language. We investigate the feasibility of applying a meta learning approach in cross-lingual few-shot hate speech detection by leveraging two meta learning models based on optimization-based and metric-based (MAML and Proto-MAML) methods. To the best of our knowledge, this is the first effort of this kind. To evaluate the performance of our approach, we consider hate speech and offensive language detection as two separate tasks and make two diverse collections of different publicly available datasets comprising 15 datasets across 8 languages for hate speech and 6 datasets across 6 languages for offensive language. Our experiments show that meta learning-based models outperform transfer learning-based models in a majority of cases, and that Proto-MAML is the best performing model, as it can quickly generalize and adapt to new languages with only a few labeled data points (generally, 16 samples per class yields an effective performance) to identify hateful or offensive content.
|
|
Keywords:
Computer Science / Artificial Intelligence [cs.AI]; Computer Science / Networking and Internet Architecture [cs.NI]; Computer Science / Social and Information Networks [cs.SI]; Cross-lingual classification; Few-shot learning; Hate speech; Meta learning; Offensive language; Transfer learning; XLMRoBERTa
|
|
URL: https://doi.org/10.1109/ACCESS.2022.3147588 https://hal.archives-ouvertes.fr/hal-03559484
|
|
BASE
|
2 |
“Thou Shalt Not Take the Lord’s Name in Vain”: A Methodological Proposal to Identify Religious Hate Content on Digital Social Networks
In: International Journal of Communication; Vol 16 (2022); 22 ; 1932-8036 (2022)
|
|
|
4 |
Constructive Aggression? Multiple Roles of Aggressive Content in Political Discourse on Russian YouTube
In: Media and Communication ; 9 ; 1 ; 181-194 ; Dark Participation in Online Communication: The World of the Wicked Web (2022)
|
|
|
6 |
Detecting weak and strong Islamophobic hate speech on social media
|
7 |
Análisis del discurso de odio en función de la ideología: Efectos emocionales y cognitivos
In: Comunicar: Revista científica iberoamericana de comunicación y educación, ISSN 1134-3478, No. 71, 2022 (Issue dedicated to: Discursos de odio en comunicación: Investigaciones y propuestas), pp. 37-48 (2022)
|
|
|
8 |
Discurso de odio y aceptación social hacia migrantes en Europa: Análisis de tuits con geolocalización
In: Comunicar: Revista científica iberoamericana de comunicación y educación, ISSN 1134-3478, No. 71, 2022 (Issue dedicated to: Discursos de odio en comunicación: Investigaciones y propuestas), pp. 21-35 (2022)
|
|
|
9 |
Motivos del discurso de odio en la adolescencia y su relación con las normas sociales
In: Comunicar: Revista científica iberoamericana de comunicación y educación, ISSN 1134-3478, No. 71, 2022 (Issue dedicated to: Discursos de odio en comunicación: Investigaciones y propuestas), pp. 9-20 (2022)
|
|
|
12 |
Changing Counterspeech
In: Cleveland State Law Review (2021)
|
|
|
13 |
La haine, c'est les autres!
In: La haine en discours ; https://halshs.archives-ouvertes.fr/halshs-03088100 ; Nolwenn Lorenzi; Claudine Moïse. La haine en discours, Éd. Le Bord de l’eau, pp.45-71, 2021, La haine en discours, 9782356877437 ; https://www.editionsbdl.com/produit/la-haine-en-discours/ (2021)
|
|
|
14 |
La haine, c'est les autres!
In: La haine en discours ; https://halshs.archives-ouvertes.fr/halshs-03088100 ; Nolwenn Lorenzi. La haine en discours, Editions le bord de l'eau, pp.45-71, 2021, La haine en discours ; https://www.lagalerne.com/livre/17968667-la-haine-en-discours-lorenzi-bailly-n--le-bord-de-l-eau (2021)
|
|
|
15 |
Je suis ému•e et je te haine
In: La haine en discours ; https://halshs.archives-ouvertes.fr/halshs-03088098 ; Nolwenn Lorenzi & Claudine Moïse. La haine en discours, Editions le Bord de l'eau, p.15-44, 2021, collection documents ; https://www.lagalerne.com/livre/17968667-la-haine-en-discours-lorenzi-bailly-n--le-bord-de-l-eau (2021)
|
|
|
16 |
Je suis ému•e et je te haine
In: La haine en discours ; https://halshs.archives-ouvertes.fr/halshs-03088098 ; Nolwenn Lorenzi; Claudine Moïse. La haine en discours, Éd. Le Bord de l’eau, pp.15-44, 2021, 9782356877437 ; https://www.editionsbdl.com/produit/la-haine-en-discours/ (2021)
|
|
|
17 |
Emotionally Informed Hate Speech Detection: A Multi-target Perspective
In: ISSN: 1866-9956 ; EISSN: 1866-9964 ; Cognitive Computation ; https://hal.archives-ouvertes.fr/hal-03275549 ; Cognitive Computation, Springer, 2021, 13 (4), ⟨10.1007/s12559-021-09862-5⟩ ; https://link.springer.com/article/10.1007%2Fs12559-021-09862-5 (2021)
|
|
|
18 |
Multiword Expression Features for Automatic Hate Speech Detection
In: NLDB 2021 - 26th International Conference on Natural Language & Information Systems ; https://hal.archives-ouvertes.fr/hal-03231047 ; NLDB 2021 - 26th International Conference on Natural Language & Information Systems, Jun 2021, Saarbrücken/Virtual, Germany ; http://nldb2021.sb.dfki.de/ (2021)
|
|
|
19 |
Sociolinguistic resilience among young academics. A quantitative analysis in Germany and France
In: Economic Resilience in Regions and Organisations ; https://hal-univ-bourgogne.archives-ouvertes.fr/hal-03210458 ; Rüdiger Wink. Economic Resilience in Regions and Organisations, Springer, 2021, 978-3-658-33078-1. ⟨10.1007/978-3-658-33079-8_12⟩ ; https://www.springer.com/gp/book/9783658330781 (2021)
|
|
|
20 |
Hate speech and offensive language detection using transfer learning approaches ; Détection du discours de haine et du langage offensant utilisant des approches de Transfer Learning
In: https://tel.archives-ouvertes.fr/tel-03276023 ; Document and Text Processing. Institut Polytechnique de Paris, 2021. English. ⟨NNT : 2021IPPAS007⟩ (2021)
|
|