Search in the Catalogues and Directories

Hits 1 – 20 of 22

1
SYN v9: large corpus of written Czech
Abstract: Corpus of contemporary written (printed) Czech sized 4.7 gigawords (GW), i.e. 5.7 billion tokens. It mostly covers the period 1990-2019 and features rich metadata, including detailed bibliographical information and text-type classification. SYN v9 contains a wide variety of text types (fiction, non-fiction, newspapers), but newspapers clearly predominate. The corpus is lemmatized and morphologically tagged with the new CNC tagset first used to annotate the SYN2020 corpus. SYN v9 is provided in a CoNLL-U-like vertical format used as input to the Manatee query engine. The data thus correspond to the corpus available to registered CNC users via the KonText query interface at http://www.korpus.cz, with one important exception: the corpus is shuffled, i.e. divided into blocks of at most 100 words (respecting sentence boundaries) whose order is randomized within each document.
Keyword: corpus; written language
URL: http://hdl.handle.net/11234/1-4635
BASE
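The shuffling described in the abstract above can be illustrated with a minimal sketch. It is not the CNC's actual tooling: the function names `split_into_blocks` and `shuffle_document` are hypothetical, every token is counted as a word, and a single sentence longer than the limit is assumed to form its own oversized block, since the abstract does not say how such cases are handled.

```python
import random
from typing import List, Optional

Sentence = List[str]    # one sentence as a list of word tokens
Block = List[Sentence]  # one block as a list of whole sentences


def split_into_blocks(sentences: List[Sentence], max_words: int = 100) -> List[Block]:
    """Greedily pack whole sentences into blocks of at most max_words words."""
    blocks: List[Block] = []
    current: Block = []
    count = 0
    for sent in sentences:
        # Close the current block if adding this sentence would exceed the limit.
        if current and count + len(sent) > max_words:
            blocks.append(current)
            current, count = [], 0
        current.append(sent)
        count += len(sent)
    if current:
        blocks.append(current)
    return blocks


def shuffle_document(sentences: List[Sentence], max_words: int = 100,
                     seed: Optional[int] = None) -> List[Block]:
    """Split one document into blocks and randomize the block order."""
    blocks = split_into_blocks(sentences, max_words)
    random.Random(seed).shuffle(blocks)  # reorders blocks only, within this document
    return blocks


# Toy document: four sentences of 40, 50, 30 and 70 placeholder words.
doc = [["w"] * 40, ["w"] * 50, ["w"] * 30, ["w"] * 70]
for block in shuffle_document(doc, seed=0):
    print([len(sentence) for sentence in block])
```

Sentences stay intact and keep their order inside a block; only the order of blocks within the document changes, which is why longer stretches of running text cannot be reconstructed from the shuffled data.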
2
Introducing the International Comparable Corpus
Kirk, John [author]; Čermáková, Anna [author]; Oksefjell Ebeling, Signe [author]. - Mannheim : Institut für Deutsche Sprache, Bibliothek, 2018
DNB Subject Category Language
3
ORAL2013: balanced corpus of informal spoken Czech (transcriptions)
Benešová, Lucie; Křen, Michal; Waclawičová, Martina. - : Charles University, Faculty of Arts, Institute of the Czech National Corpus, 2016
BASE
4
SYN v4: large corpus of written Czech
Křen, Michal; Cvrček, Václav; Čapka, Tomáš. - : Charles University, Faculty of Arts, Institute of the Czech National Corpus, 2016
BASE
5
ORAL2013: balanced corpus of informal spoken Czech (transcriptions & audio)
Benešová, Lucie; Křen, Michal; Waclawičová, Martina. - : Charles University, Faculty of Arts, Institute of the Czech National Corpus, 2016
BASE
6
Lexical Coverage Evaluation of Large-scale Multilingual Semantic Lexicons ...
EL-Haj, Mahmoud; Piao, Scott; Rayson, Paul. - : Unpublished, 2016
BASE
7
Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages
Piao, Scott Songlin; Rayson, Paul Edward; Archer, Dawn. - : European Language Resources Association (ELRA), 2016
BASE
8
Lexical coverage evaluation of large-scale multilingual semantic lexicons for twelve languages
BASE
9
Korpus spontánní mluvené češtiny ORAL2013 : = The corpus of spontaneous spoken Czech ORAL2013
In: Časopis pro moderní filologii. - Praha : Ústav pro Jazyk Český AV ČR 97 (2015) 1, 42-50
BLLDB
10
SYN2015: representative corpus of written Czech
Křen, Michal; Cvrček, Václav; Čapka, Tomáš. - : Faculty of Arts, Institute of the Czech National Corpus, Charles University in Prague, 2015
BASE
11
Building a Data Repository of Spontaneous Spoken Czech
In: Best Practices for Spoken Corpora in Linguistic Research (2014), 128-141
IDS Bibliografie zur Gesprächsforschung
12
SYN2013PUB: corpus of written Czech newspapers
Křen, Michal; Hnátková, Milena; Jelínek, Tomáš. - : Faculty of Arts, Institute of the Czech National Corpus, Charles University in Prague, 2014
BASE
13
Co je v ČNK nového, 3
In: Korpus, gramatika, axiologie. - Hradec Králové : Univerzita Hradec Králové 7 (2013), 98-100
BLLDB
14
SYN2009PUB: corpus of Czech newspapers
Křen, Michal; Bartoň, Tomáš; Hnátková, Milena. - : Faculty of Arts, Institute of the Czech National Corpus, Charles University in Prague, 2013
BASE
15
ORAL2008: Balanced corpus of informal spoken Czech
Waclawičová, Martina; Kopřivová, Marie; Křen, Michal. - : Faculty of Arts, Institute of the Czech National Corpus, Charles University in Prague, 2013
BASE
16
SYN2005: balanced corpus of written Czech
Čermák, František; Hlaváčová, Jaroslava; Hnátková, Milena. - : Faculty of Arts, Institute of the Czech National Corpus, Charles University in Prague, 2013
BASE
17
SYN2006PUB: corpus of Czech newspapers
Čermák, František; Hlaváčová, Jaroslava; Hnátková, Milena. - : Faculty of Arts, Institute of the Czech National Corpus, Charles University in Prague, 2013
BASE
18
SYN2010: balanced corpus of written Czech
Křen, Michal; Bartoň, Tomáš; Cvrček, Václav. - : Faculty of Arts, Institute of the Czech National Corpus, Charles University in Prague, 2013
BASE
19
New generation corpus-based frequency dictionaries : the case of Czech
In: International journal of corpus linguistics. - Amsterdam [u.a.] : Benjamins 10 (2005) 4, 453-467
BLLDB
OLC Linguistik
20
The International Comparable Corpus: Challenges in building multilingual spoken and written comparable corpora [Online resource]
IDS-Repository
