Search in the Catalogues and Directories

Hits 1 – 19 of 19

1	The Janes project: language resources and tools for Slovene user generated content [Journal]
	Fišer, Darja [Author]; Ljubešić, Nikola [Other]; Erjavec, Tomaž [Other]
	DNB Subject Category Language

2	Universal Dependencies 2.2
	Nivre, Joakim; Abrams, Mitchell; Agić, Željko...
	In: https://hal.archives-ouvertes.fr/hal-01930733 ; 2018 (2018)
	BASE

3	Universal Dependencies 2.3
	Nivre, Joakim; Abrams, Mitchell; Agić, Željko. - : Universal Dependencies Consortium, 2018
	BASE

4	Universal Dependencies 2.2
	Nivre, Joakim; Abrams, Mitchell; Agić, Željko. - : Universal Dependencies Consortium, 2018
	BASE

5	Dictionary of Twitterese Janes-Dict 1.0
	Gantar, Polona; Škrjanec, Iza; Fišer, Darja. - : Faculty of Arts, University of Ljubljana, 2018
	BASE

6	English-Montenegrin parallel corpus of subtitles Opus-MontenegrinSubs 1.0
	Božović, Petar; Erjavec, Tomaž; Tiedemann, Jörg. - : Jožef Stefan Institute, 2018
	BASE

7	Training corpus SETimes.SR 1.0
	Batanović, Vuk; Ljubešić, Nikola; Samardžić, Tanja. - : Regional Linguistic Data Initiative Centre ReLDI, 2018
	BASE

8	Spoken corpus Gos VideoLectures 3.0 (transcription)
	Verdonik, Darinka; Potočnik, Tomaž; Sepesy Maučec, Mirjam; Erjavec, Tomaž. - : Faculty of Electrical Engineering and Computer Science, University of Maribor, 2018
	Abstract: Gos VideoLectures is an add-on to the Gos reference corpus of spoken Slovene (http://hdl.handle.net/11356/1040) and covers public academic speech. The corpus contains a selection of public lectures available through the web portal Videolectures.net, provided by the Jožef Stefan Institute, and comprises 37 lectures and 16 hours of speech. This resource contains only the annotated transcriptions of the corpus; the audio recordings are available at http://hdl.handle.net/11356/1189. All transcriptions for Gos VideoLectures were done manually and carefully checked. The main guidelines for transcription were those of the Gos corpus (http://www.korpus-gos.net/Support/About). The transcription tool Transcriber 1.5.1 (http://trans.sourceforge.net/en/presentation.php) was used to make the transcriptions. It can also be used to read transcriptions (.trs files) or export them to different formats. The transcriptions comprise the TRS files with tabular metadata and their conversions to TEI and to the CWB vertical file format. Each recording has two TRS files, one with the pronunciation-based transcription and the other with the standardised/normalised transcription. The TEI and CWB encodings join these two transcriptions at the token level, with the normalised words also automatically PoS-tagged and lemmatised. The corpus can be used for training continuous speech recognition for the Slovene language, for phonetic research, or for any other research on Slovene academic speech.
	Keyword: academic speech; speech database; speech recognition; speech transcription; spoken corpus; TEI
	URL: http://hdl.handle.net/11356/1190
	BASE

9	Automatically constructed multiword lexicon slMWELex v0.5
	Ljubešić, Nikola; Krek, Simon; Dobrovoljc, Kaja. - : Jožef Stefan Institute, 2018
	BASE

10	Dataset and baseline model of moderated content FRENK-MMC-RTV 1.0
	Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2018
	BASE

11	JRC EU DGT Translation Memory Parsebank DGT-UD 1.0
	Ljubešić, Nikola; Erjavec, Tomaž. - : Jožef Stefan Institute, 2018
	BASE

12	Training corpus ssj500k 2.1
	Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2018
	BASE

13	Word embeddings CLARIN.SI-embed.sl 1.0
	Ljubešić, Nikola; Erjavec, Tomaž. - : Jožef Stefan Institute, 2018
	BASE

14	Bilingual terminology extraction dataset KAS-biterm 1.0
	Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2018
	BASE

15	Terminology identification dataset KAS-term 1.0
	Erjavec, Tomaž; Fišer, Darja; Ljubešić, Nikola. - : Jožef Stefan Institute, 2018
	BASE

16	Croatian language corpus Riznica 0.1
	Brozović Rončević, Dunja; Ćavar, Damir; Ćavar, Małgorzata. - : Institute of Croatian Language and Linguistics, 2018
	BASE

17	Training corpus hr500k 1.0
	Ljubešić, Nikola; Agić, Željko; Klubička, Filip. - : Jožef Stefan Institute, 2018
	BASE

18	Dataset and baseline model of moderated content FRENK-STYRIA-24sata 1.0
	Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2018
	BASE

19	hr500k – A Reference Training Corpus of Croatian.
	Erjavec, Tomaž; Ljubešić, Nikola; Klubička, Filip...
	In: Conference papers (2018)
	BASE

© 2013 - 2024 Lin|gu|is|tik