46
Corpus extraction tool LIST 1.2
Krsnik, Luka; Arhar Holdt, Špela; Čibej, Jaka. Centre for Language Resources and Technologies, University of Ljubljana; Faculty of Computer and Information Science, University of Ljubljana; Jožef Stefan Institute, 2019
BASE
50
Frequency lists of character-level n-grams from the Gigafida 2.0 corpus
Abstract:
Frequency lists of character-level n-grams were extracted from the Gigafida 2.0 Corpus of Written Standard Slovene (https://viri.cjvt.si/gigafida/) using the LIST corpus extraction tool (http://hdl.handle.net/11356/1227). The lists contain 1-5-gram combinations of characters occurring in the corpus along with their absolute and relative frequencies, percentages, and distribution across the text-types included in the corpus taxonomy. Character-level n-grams were extracted from lemmas (5 files) and lower-case word forms (5 files).
Keyword:
characters; frequency list; n-grams; Slovenian language; standard language
URL: http://hdl.handle.net/11356/1272
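The abstract above describes lists of character-level 1-5-grams with absolute and relative frequencies. As a rough illustration only, the counting step can be sketched in Python as follows. This is not the LIST tool's implementation: the function names `char_ngrams` and `ngram_frequencies` are hypothetical, the sample words are made up, relative frequency is assumed here to mean the share of all n-grams of the same length, and the distribution across text types is not modelled.

```python
from collections import Counter


def char_ngrams(word, n):
    """Return all contiguous character n-grams of `word` (empty if too short)."""
    return [word[i:i + n] for i in range(len(word) - n + 1)]


def ngram_frequencies(words, n):
    """Map each character n-gram to (absolute count, relative frequency).

    Assumption: relative frequency is the n-gram's share of all n-grams
    of length n; the published lists may normalise differently.
    """
    counts = Counter()
    for word in words:
        # Word forms are lower-cased, mirroring the "lower-case word forms" lists.
        counts.update(char_ngrams(word.lower(), n))
    total = sum(counts.values())
    if total == 0:
        return {}
    return {gram: (count, count / total) for gram, count in counts.items()}


# Made-up sample input, not taken from Gigafida 2.0.
words = ["Jezik", "je", "jez"]
freqs = ngram_frequencies(words, 2)
```

Calling `ngram_frequencies` once for each `n` from 1 to 5 would give one table per n-gram length, which matches the five files per input type mentioned in the abstract.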
52
Developmental corpus (without language corrections) Šolar 2.0 Clear
53
Frequency lists of word-level n-grams from the Gigafida 2.0 corpus
54
Frequency lists of word-level n-grams from the GOS 1.0 corpus
55
Frequency lists of character-level n-grams from the GOS 1.0 corpus
56
Corpus extraction tool LIST 1.0
Krsnik, Luka; Arhar Holdt, Špela; Čibej, Jaka. Centre for Language Resources and Technologies, University of Ljubljana; Faculty of Computer and Information Science, University of Ljubljana; Jožef Stefan Institute, 2019