
Search in the Catalogues and Directories

Hits 1 – 3 of 3

1
Prague Dependency Treebank -- Consolidated 1.0 ...
Abstract: We present a richly annotated and genre-diversified language resource, the Prague Dependency Treebank-Consolidated 1.0 (PDT-C 1.0), whose purpose, as has always been the case for the family of Prague Dependency Treebanks, is to serve both as training data for various types of NLP tasks and as a resource for linguistically oriented research. PDT-C 1.0 contains four different datasets of Czech, uniformly annotated using the standard PDT scheme (although not everything is annotated manually, as we describe in detail here). The texts come from different sources: daily newspaper articles, a Czech translation of the Wall Street Journal, transcribed dialogs, and a small number of user-generated, short, often non-standard language segments typed into a web translator. Altogether, the treebank contains around 180,000 sentences with their morphological, surface and deep syntactic annotation. The diversity of the texts and annotations should serve NLP applications well, and it is also an invaluable resource for ... : Accepted at LREC 2020 (Proceedings of Language Resources and Evaluation, Marseille, France) ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2006.03679
https://dx.doi.org/10.48550/arxiv.2006.03679
BASE
2
ForFun 1.0
Mikulová, Marie; Bejček, Eduard. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2017
BASE
3
Prague Dependency Treebank 2.0 ...
Sgall, Petr; Pajas, Petr; Mikulová, Marie. - : Linguistic Data Consortium, 2006
BASE
