2 |
Croatian SenseGraph 1.0
|
|
|
|
Abstract:
SenseGraph is a graph-like structure of word senses of the most common words of the standard Croatian language, obtained by relying on human-provided lexical substitutes for target words in context. SenseGraph is encoded in the Lexical Markup Framework (LMF; ISO 24613:2008) format. SenseGraph consists of SenseCells, which are clusters of same-sense words obtained by grouping words based on the similarity of their lexical substitution sets and the contexts in which they appear. SenseCells can be thought of as synsets in standard computational lexicographic terminology, although they exhibit more variability, which can be attributed to sense modulations in specific contexts. SenseCells are linked to each other based on loose semantic relatedness. In total, the resource covers 649 Croatian words across three parts of speech: nouns, verbs, and adjectives. More specifically, the resource contains 4,172 sentences across 230 nouns, 3,288 sentences across 200 verbs, and 4,116 sentences across 219 adjectives. These sentences were clustered using a lexical-substitution-based clustering method, yielding 2,877 SenseCells (synsets). The sentences were sampled from the SETimes.HR and hrWaC corpora.
Total number of sentences: 11,576
Total number of SenseCells: 2,877
Total number of words: 649
|
|
Keywords:
Croatian language; lexical database; lexical substitutes; semantic lexicon
|
|
URL: http://hdl.handle.net/11356/1218
|
|
BASE
|
|
|
|
3 |
Annotated corpora and tools of the PARSEME Shared Task on Automatic Identification of Verbal Multiword Expressions (edition 1.1)
|
|
|
|
BASE
|
|
|
|
4 |
Two layers of annotation for representing event mentions in news stories
|
|
|
|
BASE
|
|
|
|
6 |
Predictability of Distributional Semantics in Derivational Word Formation
|
|
|
|
In: Proceedings of COLING 2016, the 26th International Conference on Computational Linguistics : Technical Papers P. 1285–1296 (2016)
|
|
BASE
|
|
|
|
10 |
Event-centered information retrieval using kernels on event graphs
|
|
|
|
BASE
|
|
|
|
11 |
Aspect-oriented opinion mining from user reviews in Croatian
|
|
|
|
BASE
|
|
|
|
12 |
Exploring coreference uncertainty of generically extracted event mentions
|
|
|
|
BASE
|
|
|
|
14 |
Experiments on hybrid corpus-based sentiment lexicon acquisition
|
|
|
|
BASE
|
|