Search in the Catalogues and Directories

Hits 1 – 18 of 18

1
Training corpus hr500k 1.0
Ljubešić, Nikola; Agić, Željko; Klubička, Filip. - : Jožef Stefan Institute, 2018
BASE
2
hr500k – A Reference Training Corpus of Croatian.
In: Conference papers (2018)
Abstract: In this paper we present hr500k, a Croatian reference training corpus of 500 thousand tokens, segmented at document, sentence and word level, and annotated for morphosyntax, lemmas, dependency syntax, named entities, and semantic roles. We present each annotation layer via basic label statistics and describe the final encoding of the resource in CoNLL and TEI formats. We also give a description of the rather turbulent history of the resource and give insights into the topic and genre distribution in the corpus. Finally, we discuss further enrichments of the corpus with additional layers, which are already underway.
Keyword: annotation; computational linguistics; Croatian; Digital Humanities; linguistic resource; machine learning; reference corpus; Slavic Languages and Societies
URL: https://arrow.tudublin.ie/cgi/viewcontent.cgi?article=1254&context=scschcomcon
https://arrow.tudublin.ie/scschcomcon/244
BASE
3
Croatian Twitter training corpus ReLDI-NormTag-hr 1.1
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
4
Serbian Twitter training corpus ReLDI-NormTag-sr 1.0
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
5
Croatian Twitter training corpus ReLDI-NormTag-hr 1.0
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
6
Serbian Twitter training corpus ReLDI-NormTag-sr 1.1
Ljubešić, Nikola; Farkaš, Daša; Klubička, Filip. - : Jožef Stefan Institute, 2017
BASE
7
Serbian-English parallel corpus srenWaC 1.0
Ljubešić, Nikola; Esplà-Gomis, Miquel; Ortiz Rojas, Sergio. - : Jožef Stefan Institute, 2016
BASE
8
Finnish-English parallel corpus fienWaC 1.0
Ljubešić, Nikola; Esplà-Gomis, Miquel; Ortiz Rojas, Sergio. - : Jožef Stefan Institute, 2016
BASE
9
Serbian web corpus srWaC 1.1
Ljubešić, Nikola; Klubička, Filip. - : Jožef Stefan Institute, 2016
BASE
10
Inflectional lexicon hrLex 1.0
Ljubešić, Nikola; Klubička, Filip. - : Faculty of Humanities and Social Sciences, University of Zagreb, 2016
BASE
11
Inflectional lexicon hrLex 1.2
Ljubešić, Nikola; Klubička, Filip; Boras, Damir. - : Faculty of Humanities and Social Sciences, University of Zagreb, 2016
BASE
12
Tourism English-Croatian Parallel Corpus 2.0
Toral, Antonio; Esplà-Gomis, Miquel; Klubička, Filip. - : Abu-MaTran project, 2016
BASE
13
Inflectional lexicon srLex 1.2
Ljubešić, Nikola; Klubička, Filip; Boras, Damir. - : Faculty of Humanities and Social Sciences, University of Zagreb, 2016
BASE
14
Croatian-English parallel corpus hrenWaC 2.0
Ljubešić, Nikola; Esplà-Gomis, Miquel; Ortiz Rojas, Sergio. - : Jožef Stefan Institute, 2016
BASE
15
Inflectional lexicon srLex 1.0
Ljubešić, Nikola; Klubička, Filip. - : Faculty of Humanities and Social Sciences, University of Zagreb, 2016
BASE
16
Croatian web corpus hrWaC 2.1
Ljubešić, Nikola; Klubička, Filip. - : Jožef Stefan Institute, 2016
BASE
17
Slovene-English parallel corpus slenWaC 1.0
Ljubešić, Nikola; Esplà-Gomis, Miquel; Ortiz Rojas, Sergio. - : Jožef Stefan Institute, 2016
BASE
18
Bosnian web corpus bsWaC 1.1
Ljubešić, Nikola; Klubička, Filip. - : Jožef Stefan Institute, 2016
BASE

Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 18
© 2013 - 2024 Lin|gu|is|tik | Imprint | Privacy Policy | Change privacy settings