
Search in the Catalogues and Directories

Hits 1 – 20 of 1,040

1
Eine agentenbasierte Architektur für Programmierung mit gesprochener Sprache
Weigelt, Sebastian. - : KIT Scientific Publishing, Karlsruhe, 2022
BASE
2
CorpusExplorer ; Eine Software zur korpuspragmatischen Analyse
BASE
3
Generación de textos en ruso mediante técnicas de Aprendizaje Automático para la industria del lenguaje
Gregoryev, Mykyta. - : Universitat Politècnica de València, 2022
BASE
4
Measuring the quality of unstructured text in routinely collected electronic health data: a review and application
Nesca, Marcello. - 2022
BASE
5
Statistics in corpus linguistics : a new approach
Wallis, Sean. - London : Routledge, 2021
BLLDB
UB Frankfurt Linguistik
6
Deep learning and linguistic representation
Lappin, Shalom. - New York : CRC Press, Taylor & Francis Group, 2021
BLLDB
UB Frankfurt Linguistik
7
CorpusExplorer ... : Eine Software zur korpuspragmatischen Analyse ...
Rüdiger, Jan Oliver. - : Universität Kassel, 2021
BASE
8
Metaphor processing in tweets
Zayed, Omnia. - : NUI Galway, 2021
BASE
9
Exploring Construction of a Company Domain-Specific Knowledge Graph from Financial Texts Using Hybrid Information Extraction
Jen, Chun-Heng. - : KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021
BASE
10
Semantic annotation and summarization of biomedical text ...
Reeve, Lawrence H. - : Drexel University, 2021
BASE
11
NLP-Assisted Workflow Improving Bug Ticket Handling
Eriksson, Caroline; Kallis, Emilia. - : KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021
BASE
12
Improving Multilingual Models for the Swedish Language : Exploring CrossLingual Transferability and Stereotypical Biases
Katsarou, Styliani. - : KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021
BASE
13
DistillaBSE: Task-agnostic distillation of multilingual sentence embeddings : Exploring deep self-attention distillation with switch transformers
Bubla, Boris. - : KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021
BASE
14
Smart Auto-completion in Live Chat Utilizing the Power of T5 ; Smart automatisk komplettering i livechatt som utnyttjar styrkan hos T5
Wang, Zhanpeng. - : KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021
BASE
15
Detecting Signal Corruptions in Voice Recordings for Speech Therapy ; Igenkänning av Signalproblem i Röstinspelningar för Logopedi
Nylén, Helmer. - : KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021
BASE
16
Extending a Text Classifier to Multiple Languages ; Utöka en textklassificeringsmodell till flera språk
Byström, Albin. - : KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021
BASE
17
Multimodal Semantic Understanding and Navigation in Outdoor Scenes
Vasudevan, Arun Balajee. - : ETH Zurich, 2021
BASE
18
Leveraging Cognitive Processing Signals for Natural Language Understanding
Hollenstein, Nora. - : ETH Zurich, 2021
BASE
19
Essays on Representation Learning for Political Science Research
Wu, Patrick. - 2021
BASE
20
Analyzing Non-Textual Content Elements to Detect Academic Plagiarism
Abstract: Identifying academic plagiarism is a pressing problem, among others, for research institutions, publishers, and funding organizations. Detection approaches proposed so far analyze lexical, syntactical, and semantic text similarity. These approaches find copied, moderately reworded, and literally translated text. However, reliably detecting disguised plagiarism, such as strong paraphrases, sense-for-sense translations, and the reuse of non-textual content and ideas, is an open research problem. The thesis addresses this problem by proposing plagiarism detection approaches that implement a different concept—analyzing non-textual content in academic documents, such as citations, images, and mathematical content. The thesis makes the following research contributions. It provides the most extensive literature review on plagiarism detection technology to date. The study presents the weaknesses of current detection approaches for identifying strongly disguised plagiarism. Moreover, the survey identifies a significant research gap regarding methods that analyze features other than text. Subsequently, the thesis summarizes work that initiated the research on analyzing non-textual content elements to detect academic plagiarism by studying citation patterns in academic documents. To enable plagiarism checks of figures in academic documents, the thesis introduces an image-based detection process that adapts itself to the forms of image similarity typically found in academic work. The process includes established image similarity assessments and newly proposed use-case-specific methods. To improve the identification of plagiarism in disciplines like mathematics, physics, and engineering, the thesis presents the first plagiarism detection approach that analyzes the similarity of mathematical expressions. 
To demonstrate the benefit of combining non-textual and text-based detection methods, the thesis describes the first plagiarism detection system that integrates the analysis of citation-based, image-based, math-based, and text-based document similarity. The system’s user interface employs visualizations that significantly reduce the effort and time users must invest in examining content similarity. To validate the effectiveness of the proposed detection approaches, the thesis presents five evaluations that use real cases of academic plagiarism and exploratory searches for unknown cases. Real plagiarism is committed by expert researchers with strong incentives to disguise their actions. Therefore, I consider the ability to identify such cases essential for assessing the benefit of any new plagiarism detection approach. The findings of these evaluations are as follows. Citation-based plagiarism detection methods considerably outperformed text-based detection methods in identifying translated, paraphrased, and idea plagiarism instances. Moreover, citation-based detection methods found nine previously undiscovered cases of academic plagiarism. The image-based plagiarism detection process proved effective for identifying frequently observed forms of image plagiarism for image types that authors typically use in academic documents. Math-based plagiarism detection methods reliably retrieved confirmed cases of academic plagiarism involving mathematical content and identified a previously undiscovered case. Math-based detection methods offered advantages for identifying plagiarism cases that text-based methods could not detect, particularly in combination with citation-based detection methods. These results show that non-textual content elements contain a high degree of semantic information, are language-independent, and largely immutable to the alterations that authors typically perform to conceal plagiarism. 
Analyzing non-textual content complements text-based detection approaches and increases the detection effectiveness, particularly for disguised forms of academic plagiarism. ; published
Keyword: Citation Analysis; Content-based Image Retrieval; Data mining; ddc:004; Digital libraries and archives; Document representation; Evaluation of retrieval results; Image search; Information extraction; Information integration; Information Visualization; Link and co-citation analysis; Math Retrieval; Mathematics retrieval; Multilingual and cross-lingual retrieval; Natural Language Processing; Near-duplicate and plagiarism detection; Open Source Software; Plagiarism Detection; Retrieval models and ranking; Surveys and overviews; User Interaction; Users and interactive retrieval; Web searching and information discovery; Web-based interaction
URL: https://doi.org/10.5281/zenodo.4913345
http://nbn-resolving.de/urn:nbn:de:bsz:352-2-ll951b8bh8s30
BASE


© 2013 - 2024 Lin|gu|is|tik