Hits 1 – 11 of 11

1	Closing a gap in the language resources landscape : Groundwork and best practices from projects on computer-mediated communication in four European countries.
	Beißwenger, Michael; Chanier, Thierry; Chiari, Isabella. - : HAL CCSD, 2017. : Linköping Electronic Conference Proceedings, 2017
	In: Selected papers from the CLARIN Annual Conference 2016 (Oct 2016, Aix-en-Provence, France). Linköping Electronic Conference Proceedings 136, pp. 1-19, 2017, ISBN 978-91-7685-499-0 ; https://hal.archives-ouvertes.fr/hal-01379621 ; http://www.ep.liu.se/ecp/contents.asp?issue=136
	BASE

2	Tweet code-switching corpus Janes-Preklop 1.0
	Reher, Špela; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
	BASE

3	CMC training corpus Janes-Tag 2.0
	Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2017
	BASE

4	Wikipedia talk corpus Janes-Wiki 1.0
	Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
	BASE

5	CMC training corpus Janes-Syn 1.0
	Arhar Holdt, Špela; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
	Abstract: Janes-Syn is a syntactically annotated corpus of Slovene tweets, intended as a gold-standard training and testing dataset for syntactic annotation of Slovene computer-mediated communication and for detailed linguistic explorations that require highly accurate and reliable annotations. Words in the dataset are normalised, lemmatised, PoS-tagged and syntactically annotated with the JOS dependency model (http://eng.slovenscina.eu/tehnologije/razclenjevalnik). The annotations on all levels were manually corrected. The corpus creation and structure are described in: Arhar Holdt, Špela; Fišer, Darja; Erjavec, Tomaž; Krek, Simon. Syntactic annotation of Slovene CMC: first steps. Proceedings of the 4th Conference on CMC and Social Media Corpora for the Humanities, 27-28 September 2016, Ljubljana, Slovenia, 2016, pp. 3-6. http://nl.ijs.si/janes/cmc-corpora2016/proceedings/ Janes-Syn was created from two larger corpora that are also available in the repository: Janes-Norm (http://hdl.handle.net/11356/1084) and Janes-Tag (http://hdl.handle.net/11356/1123).
	Keyword: computer-mediated communication; dependency treebank; manual annotation; syntactic annotation; TEI; tokenisation
	URL: http://hdl.handle.net/11356/1086
	BASE

6	News comment corpus Janes-News 1.0
	Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
	BASE

7	Tweet comma corpus Janes-Vejica 1.0
	Popič, Damjan; Zupan, Katja; Logar, Polona. - : Jožef Stefan Institute, 2017
	BASE

8	Blog post and comment corpus Janes-Blog 1.0
	Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
	BASE

9	Forum corpus Janes-Forum 1.0
	Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
	BASE

10	CMC shortening corpus Janes-Kratko 1.0
	Goli, Teja; Osrajnik, Eneja; Fišer, Darja. - : Jožef Stefan Institute, 2017
	BASE

11	Twitter corpus Janes-Tweet 1.0
	Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
	BASE
