Search in the Catalogues and Directories

Hits 81 – 100 of 106

81
Explicit retrofitting of distributional word vectors
Glavaš, Goran; Vulić, Ivan. - : Association for Computational Linguistics, 2018
BASE
82
A resource-light method for cross-lingual semantic textual similarity
Abstract: Recognizing semantically similar sentences or paragraphs across languages is beneficial for many tasks, ranging from cross-lingual information retrieval and plagiarism detection to machine translation. Recently proposed methods for predicting cross-lingual semantic similarity of short texts, however, make use of tools and resources (e.g., machine translation systems, syntactic parsers or named entity recognition) that do not exist for many languages (or language pairs). In contrast, we propose an unsupervised and very resource-light approach for measuring semantic similarity between texts in different languages. To operate in the bilingual (or multilingual) space, we project continuous word vectors (i.e., word embeddings) from one language to the vector space of the other language via the linear translation model. We then align words according to the similarity of their vectors in the bilingual embedding space and investigate different unsupervised measures of semantic similarity exploiting bilingual embeddings and word alignments. Requiring only a limited-size set of word translation pairs between the languages, the proposed approach is applicable to virtually any pair of languages for which there exists a sufficiently large corpus, required to learn monolingual word embeddings. Experimental results on three different datasets for measuring semantic textual similarity show that our simple resource-light approach reaches performance close to that of supervised and resource-intensive methods, displaying stability across different language pairs. Furthermore, we evaluate the proposed method on two extrinsic tasks, namely extraction of parallel sentences from comparable corpora and cross-lingual plagiarism detection, and show that it yields performance comparable to that of complex resource-intensive state-of-the-art models for the respective tasks. © 2017 Published by Elsevier B.V.
Note: Part of the work presented in this article was performed during the second author's research visit to the University of Mannheim, supported by a Contact Fellowship awarded by the DAAD scholarship program "STIBET Doktoranden". The research of the last author has been carried out in the framework of the SomEMBED project (TIN2015-71147-C2-1-P). Furthermore, this work was partially funded by the Junior-professor funding programme of the Ministry of Science, Research and the Arts of the state of Baden-Württemberg (project "Deep semantic models for high-end NLP application").
Citation: Glavaš, G.; Franco-Salvador, M.; Ponzetto, S. P.; Rosso, P. (2018). A resource-light method for cross-lingual semantic textual similarity. Knowledge-Based Systems, 143, 1-9. https://doi.org/10.1016/j.knosys.2017.11.041
Keyword: Cross-lingual Word embeddings; LENGUAJES Y SISTEMAS INFORMATICOS; Plagiarism detection; Semantic textual similarity; Word alignment; Parallel sentences alignment
URL: http://hdl.handle.net/10251/146277
https://doi.org/10.1016/j.knosys.2017.11.041
BASE
83
Unsupervised cross-lingual scaling of political texts
Nanni, Federico; Ponzetto, Simone Paolo; Glavaš, Goran. - : Association for Computational Linguistics, 2017
BASE
84
University of Mannheim @ CLSciSumm-17: Citation-Based Summarization of Scientific Articles Using Semantic Textual Similarity
BASE
85
Cross-lingual classification of topics in political texts
Ponzetto, Simone Paolo; Nanni, Federico; Glavaš, Goran. - : Association for Computational Linguistics (ACL), 2017
BASE
86
Improving neural knowledge base completion with cross-lingual projections
Klein, Patrick; Glavaš, Goran; Ponzetto, Simone Paolo. - : Association for Computational Linguistics, 2017
BASE
87
Leveraging event-based semantics for automated text simplification
Štajner, Sanja; Glavaš, Goran. - : Elsevier, 2017
BASE
88
Two layers of annotation for representing event mentions in news stories
Buono, Maria Pia di; Tutek, Martin; Šnajder, Jan. - : Association for Computational Linguistics, 2017
BASE
89
If sentences could see: Investigating visual information for semantic textual similarity
Glavaš, Goran; Vulić, Ivan; Ponzetto, Simone Paolo. - : Association for Computational Linguistics, 2017
BASE
90
Dual tensor model for detecting asymmetric lexico-semantic relations
Glavaš, Goran; Ponzetto, Simone Paolo. - : Association for Computational Linguistics, 2017
BASE
91
Predicting news values from headline text and emotions
Buono, Maria Pia di; Šnajder, Jan; Dalbelo Bašić, Bojana. - : Association for Computational Linguistics, 2017
BASE
92
Unsupervised text segmentation using semantic relatedness graphs
Glavaš, Goran; Nanni, Federico; Ponzetto, Simone Paolo. - : Association for Computational Linguistics, 2016
BASE
93
Spanish NER with word representations and conditional random fields
Copara Zea, Jenny Linet; Ochoa Luna, José Eduardo; Thorne, Camilo. - : Association for Computational Linguistics, 2016
BASE
94
Capturing interdisciplinarity in academic abstracts
Nanni, Federico; Dietz, Laura; Faralli, Stefano. - : Corporation for National Research Initiatives, 2016
BASE
95
Simplifying lexical simplification: do we need simplified corpora?
Glavaš, Goran; Štajner, Sanja. - : Curran, 2015
BASE
96
TKLBLIIR: Detecting Twitter paraphrases with TweetingJay
Karan, Mladen; Glavaš, Goran; Šnajder, Jan. - : Association for Computational Linguistics, 2015
BASE
97
TAKELAB: Medical information extraction and linking with MINERAL
Glavaš, Goran. - : Association for Computational Linguistics, 2015
BASE
98
Constructing coherent event hierarchies from news stories
Glavaš, Goran; Šnajder, Jan. - : Association for Computational Linguistics, 2014
BASE
99
Event-centered simplification of news stories
Štajner, Sanja; Glavaš, Goran. - : Association for Computational Linguistics, 2013
BASE
100
Recognizing identical events with graph kernels
Glavaš, Goran; Šnajder, Jan. - : Association for Computational Linguistics, 2013
BASE
