1 |
What Motivates You? Benchmarking Automatic Detection of Basic Needs from Short Posts ...
2 |
Five Psycholinguistic Characteristics for Better Interaction with Users ...
3 |
UH-PRHLT at SemEval-2016 Task 3: Combining Lexical and Semantic-based Features for Community Question Answering ...
4 |
A Resource-Light Method for Cross-Lingual Semantic Textual Similarity ...
5 |
A Low Dimensionality Representation for Language Variety Identification
6 |
A resource-light method for cross-lingual semantic textual similarity
7 |
Semantically-informed distance and similarity measures for paraphrase plagiarism identification
8 |
A Cross-domain and Cross-language Knowledge-based Representation of Text and its Meaning
9 |
A Systematic Study of Knowledge Graph Analysis for Cross-language Plagiarism Detection
10 |
Cross-language Plagiarism Detection over Continuous-space- and Knowledge Graph-based Representations of Language
11 |
Language variety identification using distributed representations of words and documents
Abstract:
The final publication is available at Springer via http://dx.doi.org/10.1007/978-3-319-24027-5_3
In this work we focus on the use of distributed representations of words and documents using the continuous Skip-gram model. We compare this model with three recent approaches (Information Gain Word-Patterns, TF-IDF graphs and Emotion-labeled Graphs) and with several baselines. We evaluate the models on Hispablogs, a new dataset introduced here that collects Spanish blogs from five countries: Argentina, Chile, Mexico, Peru and Spain. Experimental results show state-of-the-art performance in language variety identification.
This research has been carried out within the framework of the European Commission WIQ-EI IRSES (no. 269180) and DIANA - Finding Hidden Knowledge in Texts (TIN2012-38603-C02) projects. The work of the second author was partially funded by Autoritas Consulting SA and by the Spanish Ministry of Economics by means of an ECOPORTUNITY IPT-2012-1220-430000 grant.
Franco Salvador, M.; Rangel, F.; Rosso, P.; Taulé, M.; Martí, MA. (2015). Language variety identification using distributed representations of words and documents. In Experimental IR Meets Multilinguality, Multimodality, and Interaction: 6th International Conference of the CLEF Association, CLEF'15, Toulouse, France, September 8-11, 2015, Proceedings. Springer International Publishing. 28-40. https://doi.org/10.1007/978-3-319-24027-5_3
Keywords:
Author profiling; Distributed representations; Emotion-labeled Graphs; Information Gain Word-Patterns; Language variety identification; LENGUAJES Y SISTEMAS INFORMATICOS; TF-IDF graphs
URL: https://doi.org/10.1007/978-3-319-24027-5_3 http://hdl.handle.net/10251/64372
12 |
Cross-domain polarity classification using a knowledge-enhanced meta-classifier
13 |
Detección de plagio translingüe utilizando una red semántica multilingüe
14 |
Análisis de similitud basado en grafos: una nueva aproximación a la detección de plagio translingüe ; Graph-based similarity analysis: a new approach to cross-language plagiarism detection
15 |
Knowledge Graphs as Context Models: Improving the Detection of Cross-Language Plagiarism with Paraphrasing
16 |
Cross-language plagiarism detection using multilingual semantic network
17 |
Análisis de similitud basado en grafos: Una nueva aproximación a la detección de plagio translingüe ; Graph-Based Similarity Analysis: A New Approach to Cross-Language Plagiarism Detection
18 |
Detección de plagio translingüe utilizando el diccionario estadístico de BabelNet ; Cross-language Plagiarism Detection Using BabelNet’s Statistical Dictionary