
Search in the Catalogues and Directories







Hits 21 – 40 of 174

21	Corpus of Croatian news portals ENGRI (2014-2018)
	Bogunović, Irena; Kučić, Mario; Ljubešić, Nikola. - : University of Rijeka, Faculty of Maritime Studies, 2021
	BASE

22	Offensive language dataset of Croatian, English and Slovenian comments FRENK 1.1
	Ljubešić, Nikola; Fišer, Darja; Erjavec, Tomaž. - : Jožef Stefan Institute, 2021
	BASE

23	Spoken corpus Gos VideoLectures 4.1 (transcription)
	Verdonik, Darinka; Potočnik, Tomaž; Sepesy Maučec, Mirjam. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2021
	BASE

24	Corpus of Slovenian school texts SBSJ 1.0
	Ahačič, Kozma; Atelšek, Simon; Erjavec, Tomaž. - : ZRC SAZU, 2021
	BASE

25	Abstracts from the KAS corpus KAS-Abs 1.0
	Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute; Faculty of Electrical Engineering and Computer Science, University of Maribor, 2021
	BASE

26	Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.1
	Erjavec, Tomaž; Ogrodniczuk, Maciej; Osenova, Petya. - : CLARIN ERIC, 2021
	BASE

27	Linguistically annotated multilingual comparable corpora of parliamentary debates ParlaMint.ana 2.0
	Erjavec, Tomaž; Ogrodniczuk, Maciej; Osenova, Petya. - : CLARIN ERIC, 2021
	BASE

28	Corpus of term-annotated texts RSDO5 1.0
	Jemec Tomazin, Mateja; Trojar, Mitja; Žagar, Mojca. - : ZRC SAZU, 2021
	BASE

29	Multilingual comparable corpora of parliamentary debates ParlaMint 2.0
	Erjavec, Tomaž; Ogrodniczuk, Maciej; Osenova, Petya. - : CLARIN ERIC, 2021
	BASE

30	Corpus of Written Standard Slovene Gigafida 2.0
	Krek, Simon; Erjavec, Tomaž; Repar, Andraž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2021
	BASE

31	The corpus of older Slovenian narrative prose PriLit 1.0
	Žejn, Andrejka; Erjavec, Tomaž. - : ZRC SAZU, 2021
	BASE

32	Creating the European Literary Text Collection (ELTeC): Challenges and Perspectives ...
	Schöch, Christof; Erjavec, Tomaz; Patras, Roxana. - : Zenodo, 2021
	BASE

33	Creating the European Literary Text Collection (ELTeC): Challenges and Perspectives ...
	Schöch, Christof; Erjavec, Tomaz; Patras, Roxana. - : Zenodo, 2021
	BASE

34	The KAS corpus of Slovenian academic writing [<Journal>]
	Erjavec, Tomaž [author]; Fišer, Darja [author]; Ljubešić, Nikola [author]
	DNB Subject Category Language

35	Universal Dependencies 2.7
	Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. - : Universal Dependencies Consortium, 2020
	BASE

36	Universal Dependencies 2.6
	Zeman, Daniel; Nivre, Joakim; Abrams, Mitchell. - : Universal Dependencies Consortium, 2020
	BASE

37	The CLASSLA-StanfordNLP model for lemmatisation of standard Macedonian 1.0
	Ljubešić, Nikola; Zdravkova, Katerina; Erjavec, Tomaž. - : Jožef Stefan Institute, 2020
	BASE

38	The CLASSLA-StanfordNLP model for morphosyntactic annotation of standard Macedonian 1.0
	Ljubešić, Nikola; Zdravkova, Katerina; Stojanoska, Sanja. - : Jožef Stefan Institute, 2020
	BASE

39	Multilingual comparable corpora of parliamentary debates ParlaMint 1.0
	Erjavec, Tomaž; Grigorova, Vladislava; Ljubešić, Nikola. - : CLARIN ERIC, 2020
	BASE

40	Slovenian parliamentary corpus (1990-2018) siParl 2.0
	Pančur, Andrej; Erjavec, Tomaž; Ojsteršek, Mihael; Šorn, Mojca; Blaj Hribar, Neja. - : Institute of Contemporary History, 2020
	Abstract: The siParl corpus contains minutes of the Assembly of the Republic of Slovenia for the 11th legislative period 1990-1992, minutes of the National Assembly of the Republic of Slovenia from the 1st to the 7th legislative period 1992-2018, minutes of the working bodies of the National Assembly of the Republic of Slovenia from the 2nd to the 7th legislative period 1996-2018, and minutes of the Council of the President of the National Assembly from the 2nd to the 7th legislative period 1996-2018. The corpus comprises over 10 thousand sessions, one million speeches or 200 million words. The corpus contains meta-data about the speakers, a typology of sessions etc., and structural, editorial and linguistic annotations. The corpus is encoded according to the Parla-CLARIN schema (https://github.com/clarin-eric/parla-clarin). Each mandate is in one directory, and each session in one file. This item comprises the following datasets: 1. source DARIAH-SI Parla-CLARIN encoded corpus; 2. linguistically annotated Parla-CLARIN encoded corpus: tokenisation, MSD tagging, lemmatisation, Universal Dependencies features and syntactic parses, named entities; 3. linguistically annotated corpus in vertical format used by CWB and Sketch Engine concordancers; this format is simpler and smaller but does not contain all the information from the source TEI; 4. linguistically annotated corpus in CoNLL-U format as used by Universal Dependencies; 5. plain text of the corpus. Note that each dataset also includes TSV meta-data files on sessions (files) and speakers. As opposed to the previous version 1.0, this version corrects many errors, has substantially better meta-data, and the linguistic processing has more levels and fewer errors.
	Keyword: Parla-CLARIN; parliamentary debates; Slovenian Parliament; TEI; universal dependencies
	URL: http://hdl.handle.net/11356/1300
	BASE
