Hits 1 – 2 of 2
1
Training corpus hr500k 1.0
Ljubešić, Nikola; Agić, Željko; Klubička, Filip. Jožef Stefan Institute, 2018
BASE
2
hr500k – A Reference Training Corpus of Croatian.
Erjavec, Tomaž; Ljubešić, Nikola; Klubička, Filip; Agić, Željko; Batanović, Vuk
In: Conference papers (2018)
Abstract:
In this paper we present hr500k, a Croatian reference training corpus of 500 thousand tokens, segmented at document, sentence and word level, and annotated for morphosyntax, lemmas, dependency syntax, named entities, and semantic roles. We present each annotation layer via basic label statistics and describe the final encoding of the resource in CoNLL and TEI formats. We also give a description of the rather turbulent history of the resource and give insights into the topic and genre distribution in the corpus. Finally, we discuss further enrichments of the corpus with additional layers, which are already underway.
Keyword:
annotation; computational linguistics; Croatian; Digital Humanities; linguistic resource; machine learning; reference corpus; Slavic Languages and Societies
URL:
https://arrow.tudublin.ie/cgi/viewcontent.cgi?article=1254&context=scschcomcon
https://arrow.tudublin.ie/scschcomcon/244
BASE