2 |
Croatian corpus of non-professional written language by typical speakers and speakers with language disorders RAPUT 1.0
Abstract:
The corpus consists of texts produced by non-professional typical speakers and speakers with various language disorders (developmental language disorder, dyslexia, traumatic brain injury, aphasia, other). Roughly half of the corpus consists of texts by typical speakers, and the other half of texts by speakers with language disorders. Language samples were elicited by six groups of tasks representing different writing styles (descriptive, expository, narrative, and letter) and different levels of formality. The corpus has been manually annotated for normalized forms, lemmas, morphosyntactic information (following the MULTEXT-East tagset), and type of error (phonological segmentation, orthography, non-standard spelling, typo, syntax, etc.). The UD morphosyntactic description has for the most part been generated automatically from the MULTEXT-East morphosyntactic information.
Keywords:
non-professional written language; speakers with language disorders; typical speakers
URL: http://hdl.handle.net/11356/1435
BASE
3 |
The CLASSLA-StanfordNLP model for lemmatisation of non-standard Serbian 1.1
BASE
4 |
The CLASSLA-StanfordNLP model for lemmatisation of non-standard Croatian 1.0
BASE
5 |
The CLASSLA-StanfordNLP model for lemmatisation of non-standard Serbian 1.0
BASE
6 |
The CLASSLA-StanfordNLP model for morphosyntactic annotation of non-standard Serbian 1.0
BASE
7 |
The CLASSLA-StanfordNLP model for morphosyntactic annotation of non-standard Croatian 1.0
BASE
8 |
The CLASSLA-StanfordNLP model for lemmatisation of non-standard Croatian 1.1
BASE