
Search in the Catalogues and Directories

Hits 21 – 40 of 124

21
MSIR@FIRE: A Comprehensive Report from 2013 to 2016
BASE
22
Fine-Grained Analysis of Language Varieties and Demographics
Rangel, Francisco; Rosso, Paolo; Zaghouani, Wajdi. - : Cambridge University Press, 2020
BASE
23
Multilingual Stance Detection in Social Media Political Debates
BASE
24
Fake Opinion Detection: How Similar are Crowdsourced Datasets to Real Data?
Fornaciari, Tommaso; Cagnina, Leticia; Rosso, Paolo. - : Springer-Verlag, 2020
BASE
25
FacTweet: Profiling Fake News Twitter Accounts
BASE
26
Overview of PAN 2020: Authorship Verification, Celebrity Profiling, Profiling Fake News Spreaders on Twitter, and Style Change Detection
BASE
27
The Role of Personality and Linguistic Patterns in Discriminating Between Fake News Spreaders and Fact Checkers
BASE
28
Scalable and Language-Independent Embedding-based Approach for Plagiarism Detection Considering Obfuscation Type: No Training Phase
Gharavi, Erfaneh; Veisi, Hadi; Rosso, Paolo. - : Springer-Verlag, 2020
BASE
29
Introduction to the Special Section on Computational Modeling and Understanding of Emotions in Conflictual Social Interactions
Rosso, Paolo; Clavel, Chloé; Damiano, Rossana. - : Association for Computing Machinery, 2020
BASE
30
Stance polarity in political debates: A diachronic perspective of network homophily and conversations on Twitter
BASE
31
IDAT@FIRE2019: Overview of the Track on Irony Detection in Arabic Tweets
Ghanem, Bilal; Karoui, Jihen; Benamara, Farah. - : CEUR-WS.org, 2019
BASE
32
On the Use of Character n-grams as the only Intrinsic Evidence of Plagiarism
Rosso, Paolo; Bensalem, Imene; Chikhi, Salim. - : Springer-Verlag, 2019
BASE
33
Online Hate Speech against Women: Automatic Identification of Misogyny and Sexism on Twitter
BASE
34
On the use of word embedding for cross language plagiarism detection
BASE
35
Overview of PAN 2019: Bots and Gender Profiling, Celebrity Profiling, Cross-domain Authorship Attribution and Style Change Detection
BASE
36
Improving Attitude Words Classification for Opinion Mining using Word Embedding
BASE
37
Classifier combination approach for question classification for Bengali question answering system
Banerjee, Somnath; Bandyopadhyay, Sivaji; Rosso, Paolo. - : Springer-Verlag, 2019
BASE
38
A Decade of Shared Tasks in Digital Text Forensics at PAN
BASE
39
Paraphrase Plagiarism Identification with Character-level Features
BASE
40
A Low Dimensionality Representation for Language Variety Identification
Abstract: Language variety identification aims at labelling texts in a native language (e.g. Spanish, Portuguese, English) with their specific variety (e.g. Argentina, Chile, Mexico, Peru, Spain; Brazil, Portugal; UK, US). In this work we propose a low dimensionality representation (LDR) to address this task with five different varieties of Spanish: Argentina, Chile, Mexico, Peru and Spain. We compare our LDR method with common state-of-the-art representations and show an increase in accuracy of ~35%. Furthermore, we compare LDR with two reference distributed representation models. Experimental results show competitive performance while dramatically reducing the dimensionality (and increasing the big data suitability) to only 6 features per variety. Additionally, we analyse the behaviour of the employed machine learning algorithms and the most discriminating features. Finally, we employ an alternative dataset to test the robustness of our low dimensionality representation with another set of similar languages.
The work of the first author was in the framework of ECOPORTUNITY IPT-2012-1220-430000. The work of the last two authors was in the framework of the SomEMBED MINECO TIN2015-71147-C2-1-P research project. This work has also been supported by the Generalitat Valenciana under the grant ALMAPATER (PrometeoII/2014/030).
Rangel-Pardo, FM.; Franco-Salvador, M.; Rosso, P. (2018). A Low Dimensionality Representation for Language Variety Identification. Lecture Notes in Computer Science. 9624:156-169. https://doi.org/10.1007/978-3-319-75487-1_13
Keyword: Author profiling; Big data; Language variety identification; LENGUAJES Y SISTEMAS INFORMATICOS; Low dimensionality representation; Similar languages discrimination; Social media
URL: http://hdl.handle.net/10251/146184
https://doi.org/10.1007/978-3-319-75487-1_13
BASE

© 2013 - 2024 Lin|gu|is|tik