CzeSL Grammatical Error Correction Dataset (CzeSL-GEC)
|
|
Šebesta, Karel; Bedřichová, Zuzanna; Šormová, Kateřina; Štindlová, Barbora; Hrdlička, Milan; Hrdličková, Tereza; Hana, Jiří; Petkevič, Vladimír; Jelínek, Tomáš; Škodová, Svatava; Janeš, Petr; Lundáková, Kateřina; Skoumalová, Hana; Sládek, Šimon; Pierscieniak, Piotr; Toufarová, Dagmar; Straka, Milan; Rosen, Alexandr; Náplava, Jakub; Poláčková, Marie. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2017
|
|
Abstract:
CzeSL-GEC is a corpus of sentence pairs, each consisting of an original Czech sentence and its corrected version. The sentences were collected from essays written by non-native learners of Czech and by Czech pupils with a Romani background. The corpus was built from the unreleased CzeSL-man corpus (http://utkl.ff.cuni.cz/learncorp/). All sentences in the corpus are word-tokenized.
|
|
Keywords:
grammatical error correction; natural language correction
|
|
URL: http://hdl.handle.net/11234/1-2143