
Search in the Catalogues and Directories

Hits 1 – 20 of 47

1
A comparative study of different features for efficient automatic hate speech detection
In: IPrA 2021 - 17th International Pragmatics Conference ; https://hal.archives-ouvertes.fr/hal-03115781 ; IPrA 2021 - 17th International Pragmatics Conference, Jun 2021, Winterthur, Switzerland (2021)
Abstract: International audience ; Commonly, Hate Speech (HS) is defined as any communication that disparages a person or a group on the basis of some characteristic (race, colour, ethnicity, gender, sexual orientation, nationality, etc. (Nockleby, 2000)). Due to the massive activity of users on social networks (around 500 million tweets per day), hate speech is continuously increasing on the web. Recent initiatives, such as SemEval-2019 shared task 5, HatEval 2019 (Basile et al., 2019), contribute to the development of automatic hate speech detection (HSD) systems by making annotated hateful corpora available. We focus our research on the automatic classification of hateful tweets, which is the first sub-task of HatEval 2019. The best HatEval 2019 HSD system was FERMI (Indurthi et al., 2019), with a 65.1% macro-F1 score on the test corpus. This system used sentence embeddings from the Universal Sentence Encoder (USE) (Cer et al., 2018) as input to a Support Vector Machine classifier. In this article, we study the impact of different features on an HSD system. We use a deep neural network (DNN) based classifier with USE. We investigate word-level features: a lexicon of hateful words (HFW), Part of Speech (POS), uppercase letters (UP), punctuation marks (PUNCT), the ratio of the number of times a word appears in hateful tweets to the total number of times that word appears (RatioHW), and emojis (EMO). We think that these features are relevant because they carry feelings. For instance, case (UP) and punctuation (PUNCT) can carry the intonation of a tweet and can be used to express hateful content. For HFW features, we tag each word of a tweet as hateful or not using the Hatebase lexicon (Hatebase.org) and associate a binary value with each word. For POS features, we use twpipe (Liu et al., 2018) to tag the words, and this information is coded as a one-hot vector. For emojis, we generate an embedding vector using the emoji2vec tools (Eisner et al., 2016).
The input of our neural network consists of the USE vector and our additional features. We used convolutional neural networks (CNN) as a binary classifier. We performed the experiments on the HatEval 2019 corpus to study the influence of each proposed feature. Our baseline system without the proposed features achieves a 65.7% macro-F1 score on the test corpus. Surprisingly, HFW degrades the system performance and decreases the macro-F1 by 14 points compared to the baseline. This may be because some words are hateful only in a particular context. UP, RatioHW and PUNCT slightly degrade the baseline system. The POS features do not change the baseline result and so are probably not correlated with hate speech. The best result is obtained using EMO features, with 66.0% macro-F1. Emojis are widely used to convey emotions. In our system, they are modeled by a specific embedding vector. USE does not take emojis into account. Therefore, EMO features give additional information to USE about the hateful content of tweets.
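As an illustration of the word-level features and the evaluation metric named in the abstract, the following is a minimal Python sketch, not the authors' implementation. It assumes whitespace tokenization, binary labels (1 = hateful), and a binary encoding of UP and PUNCT; the function names `ratio_hw`, `word_features` and `macro_f1` are hypothetical and do not come from the paper.

```python
import string
from collections import Counter


def normalize(token):
    """Lowercase a token and strip surrounding punctuation (assumed normalization)."""
    return token.lower().strip(string.punctuation)


def ratio_hw(tweets, labels):
    """RatioHW: occurrences of a word in hateful tweets / all occurrences of that word."""
    hateful, total = Counter(), Counter()
    for text, label in zip(tweets, labels):
        for token in text.split():
            word = normalize(token)
            if not word:  # token was pure punctuation
                continue
            total[word] += 1
            if label == 1:
                hateful[word] += 1
    return {word: hateful[word] / total[word] for word in total}


def word_features(tweet, ratios):
    """Per-token UP, PUNCT and RatioHW values (binary UP/PUNCT is an assumption)."""
    features = []
    for token in tweet.split():
        features.append({
            "token": token,
            "UP": int(any(ch.isupper() for ch in token)),
            "PUNCT": int(any(ch in string.punctuation for ch in token)),
            "RatioHW": ratios.get(normalize(token), 0.0),  # 0.0 for unseen words
        })
    return features


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = []
    for cls in sorted(set(y_true) | set(y_pred)):
        tp = sum(t == cls and p == cls for t, p in zip(y_true, y_pred))
        fp = sum(t != cls and p == cls for t, p in zip(y_true, y_pred))
        fn = sum(t == cls and p != cls for t, p in zip(y_true, y_pred))
        denom = 2 * tp + fp + fn
        scores.append(2 * tp / denom if denom else 0.0)
    return sum(scores) / len(scores)
```

In the system described above, such per-word values would be combined with the USE vector before being passed to the CNN; how exactly they are combined is not specified in the abstract and is left out of this sketch.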
Keyword: [INFO.INFO-SI]Computer Science [cs]/Social and Information Networks [cs.SI]; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; [INFO]Computer Science [cs]
URL: https://hal.archives-ouvertes.fr/hal-03115781/file/CFP___Offensive_language_on_social_media___International_Pragmatics_Conference_panel.pdf
https://hal.archives-ouvertes.fr/hal-03115781/document
https://hal.archives-ouvertes.fr/hal-03115781
BASE
2
Multiword Expression Features for Automatic Hate Speech Detection
In: NLDB 2021 - 26th International Conference on Natural Language & Information Systems ; https://hal.archives-ouvertes.fr/hal-03231047 ; NLDB 2021 - 26th International Conference on Natural Language & Information Systems, Jun 2021, Saarbrücken/Virtual, Germany ; http://nldb2021.sb.dfki.de/ (2021)
BASE
3
BERT-based Semantic Model for Rescoring N-best Speech Recognition List
In: INTERSPEECH 2021 ; https://hal.archives-ouvertes.fr/hal-03248881 ; INTERSPEECH 2021, Aug 2021, Brno, Czech Republic ; https://www.interspeech2021.org/ (2021)
BASE
4
Improving Automatic Hate Speech Detection with Multiword Expression Features ...
BASE
5
Introduction of semantic model to help speech recognition
In: TSD 2020 - Twenty-third International Conference on Text, Speech and Dialogue ; https://hal.archives-ouvertes.fr/hal-02862245 ; TSD 2020 - Twenty-third International Conference on Text, Speech and Dialogue, Sep 2020, Brno, Czech Republic (2020)
BASE
6
Introduction d’informations sémantiques dans un système de reconnaissance de la parole
In: Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d'Études sur la Parole ; 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d'Études sur la Parole ; https://hal.archives-ouvertes.fr/hal-02798559 ; 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 1 : Journées d'Études sur la Parole, 2020, Nancy, France. pp.362-369 (2020)
BASE
7
RNN Language Model Estimation for Out-of-Vocabulary Words
In: Lecture Notes in Artificial Intelligence ; https://hal.archives-ouvertes.fr/hal-03054936 ; Lecture Notes in Artificial Intelligence, Springer, In press, 12598, ⟨10.1007/978-3-030-66527-2_15⟩ (2020)
BASE
8
DNN-Based Semantic Model for Rescoring N-best Speech Recognition List ...
Fohr, Dominique; Illina, Irina. - : arXiv, 2020
BASE
9
Dynamic Extension of ASR Lexicon Using Wikipedia Data
In: IEEE Workshop on Spoken and Language Technology (SLT) ; https://hal.archives-ouvertes.fr/hal-01874495 ; IEEE Workshop on Spoken and Language Technology (SLT), Dec 2018, Athens, Greece (2018)
BASE
10
Modelling Semantic Context of OOV Words in Large Vocabulary Continuous Speech Recognition
In: ISSN: 2329-9290 ; EISSN: 2329-9304 ; IEEE/ACM Transactions on Audio, Speech and Language Processing ; https://hal.inria.fr/hal-01461617 ; IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2017, 25 (3), pp.598 - 610. ⟨10.1109/TASLP.2017.2651361⟩ (2017)
BASE
11
Topic segmentation in ASR transcripts using bidirectional rnns for change detection
In: ASRU 2017 - IEEE Automatic Speech Recognition and Understanding Workshop ; https://hal.archives-ouvertes.fr/hal-01599682 ; ASRU 2017 - IEEE Automatic Speech Recognition and Understanding Workshop, Dec 2017, Okinawa, Japan (2017)
BASE
12
Out-of-Vocabulary Word Probability Estimation using RNN Language Model
In: 8th Language & Technology Conference ; https://hal.archives-ouvertes.fr/hal-01623784 ; 8th Language & Technology Conference, Nov 2017, Poznan, Poland (2017)
BASE
13
How Diachronic Text Corpora Affect Context based Retrieval of OOV Proper Names for Audio News
In: LREC 2016 ; https://hal.archives-ouvertes.fr/hal-01331714 ; LREC 2016, May 2016, Portoroz, Slovenia (2016)
BASE
14
Improved Neural Bag-of-Words Model to Retrieve Out-of-Vocabulary Words in Speech Recognition
In: INTERSPEECH 2016 ; https://hal.archives-ouvertes.fr/hal-01384488 ; INTERSPEECH 2016, Sep 2016, San Francisco, United States. ⟨10.21437/Interspeech.2016-1219⟩ (2016)
BASE
15
Temporal and Lexical Context of Diachronic Text Documents for Automatic Out-Of-Vocabulary Proper Name Retrieval
In: Human Language Technology. Challenges for Computer Science and Linguistics ; https://hal.inria.fr/hal-01475080 ; Zygmunt Vetulani; Hans Uszkoreit; Marek Kubis Human Language Technology. Challenges for Computer Science and Linguistics, 9561, Springer, pp.41-54, 2016, Lecture Notes in Computer Science, 978-3-319-43808-5. ⟨10.1007/978-3-319-43808-5_4⟩ (2016)
BASE
16
Dynamic adjustment of language models for automatic speech recognition using word similarity
In: IEEE Workshop on Spoken Language Technology (SLT 2016) ; https://hal.archives-ouvertes.fr/hal-01384365 ; IEEE Workshop on Spoken Language Technology (SLT 2016), Dec 2016, San Diego, CA, United States ; http://www.slt2016.org/ (2016)
BASE
17
Document Level Semantic Context for Retrieving OOV Proper Names
In: 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ; https://hal.archives-ouvertes.fr/hal-01331716 ; 2016 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) , Mar 2016, Shanghai, China. pp.6050-6054, ⟨10.1109/ICASSP.2016.7472839⟩ (2016)
BASE
18
OOV Proper Name Retrieval using Topic and Lexical Context Model
In: IEEE International Conference on Acoustics, Speech and Signal Processing ; https://hal.archives-ouvertes.fr/hal-01184963 ; IEEE International Conference on Acoustics, Speech and Signal Processing, 2015, Brisbane, Australia (2015)
BASE
19
Continuous Word Representation using Neural Networks for Proper Name Retrieval from Diachronic Documents
In: Interspeech 2015 ; https://hal.archives-ouvertes.fr/hal-01184951 ; Interspeech 2015, Sep 2015, Dresden, Germany (2015)
BASE
20
Neural Networks Revisited for Proper Name Retrieval from Diachronic Documents
In: proceedings of LTC2015 ; LTC Language & Technology Conference ; https://hal.archives-ouvertes.fr/hal-01240480 ; LTC Language & Technology Conference, Nov 2015, Poznan, Poland. pp.120-124 (2015)
BASE
