Catalogue search • Linguistik portal • Fachinformationsdienst (FID)

1	A Systematic Study of Knowledge Graph Analysis for Cross-language Plagiarism Detection
	Franco-Salvador, Marc; Rosso, Paolo; Montes Gomez, Manuel. - : Elsevier, 2016
	BASE

2	Cross-language Plagiarism Detection over Continuous-space- and Knowledge Graph-based Representations of Language
	Franco-Salvador, Marc; Gupta, Parth Alokkumar; Rosso, Paolo; Banchs, Rafael. - : Elsevier, 2016
	Abstract: This is the author’s version of a work that was accepted for publication in Knowledge-Based Systems. Changes resulting from the publishing process, such as peer review, editing, corrections, structural formatting, and other quality control mechanisms, may not be reflected in this document. Changes may have been made to this work since it was submitted for publication. A definitive version was subsequently published in Knowledge-Based Systems 111 (2016) 87–99. DOI 10.1016/j.knosys.2016.08.004. ; Cross-language (CL) plagiarism detection aims at detecting plagiarised fragments of text among documents in different languages. The main research question of this work is whether knowledge graph representations and continuous space representations can complement each other and improve the state-of-the-art performance of CL plagiarism detection methods. To this end, we propose and evaluate hybrid models to assess the semantic similarity of two segments of text in different languages. The proposed hybrid models combine knowledge graph representations with continuous space representations, aiming to exploit their complementarity in capturing different aspects of cross-lingual similarity. We also present the continuous word alignment-based similarity analysis, a new model to estimate similarity between text fragments. We compare the aforementioned approaches with several state-of-the-art models in the task of CL plagiarism detection and study their performance in detecting plagiarism cases of different lengths and obfuscation types. We conduct experiments over Spanish-English and German-English datasets. Experimental results show that continuous representations allow the continuous word alignment-based similarity analysis model to obtain competitive results and the knowledge-based document similarity model to outperform the state of the art in CL plagiarism detection. © 2016 Elsevier B.V. All rights reserved.
; This research has been carried out in the framework of the FPI-UPV pre-doctoral grant (registration no. 3505) awarded to Parth Gupta and in the framework of the national projects DIANA-APPLICATIONS - Finding Hidden Knowledge in Texts: Applications (TIN2012-38603-C02-01), and SomEMBED: SOcial Media language understanding - EMBEDing contexts (TIN2015-71147-C2-1-P). We would like to thank Martin Potthast, Daniel Ortiz-Martinez, and Luis A. Leiva for their support and comments during this research. ; Franco-Salvador, M.; Gupta, PA.; Rosso, P.; Banchs, R. (2016). Cross-language Plagiarism Detection over Continuous-space- and Knowledge Graph-based Representations of Language. Knowledge-Based Systems. 111:87-99. https://doi.org/10.1016/j.knosys.2016.08.004
	Keyword: Continuous representations; Cross-language; Knowledge graphs; LENGUAJES Y SISTEMAS INFORMATICOS; Multilingual semantic network; Plagiarism detection
	URL: https://doi.org/10.1016/j.knosys.2016.08.004 http://hdl.handle.net/10251/82493
	BASE
