
Search in the Catalogues and Directories

Hits 1 – 20 of 22

1
Training corpus ssj500k 2.3
Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2021
BASE
2
Training corpus ssj500k 2.2
Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2019
BASE
3
Slovenian parliamentary corpus ParlaMeter-sl 1.0
Dobranić, Filip; Ljubešić, Nikola; Erjavec, Tomaž. - : Jožef Stefan Institute, 2019
BASE
4
Croatian Twitter training corpus ReLDI-NormTagNER-hr 2.1
Ljubešić, Nikola; Erjavec, Tomaž; Batanović, Vuk. - : Jožef Stefan Institute, 2019
BASE
5
CMC training corpus Janes-Tag 2.1
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2019
BASE
6
Croatian parliamentary corpus ParlaMeter-hr 1.0
Dobranić, Filip; Ljubešić, Nikola; Erjavec, Tomaž. - : Jožef Stefan Institute, 2019
BASE
7
Serbian Twitter training corpus ReLDI-NormTagNER-sr 2.1
Ljubešić, Nikola; Erjavec, Tomaž; Batanović, Vuk. - : Jožef Stefan Institute, 2019
BASE
8
Training corpus SETimes.SR 1.0
Batanović, Vuk; Ljubešić, Nikola; Samardžić, Tanja. - : Regional Linguistic Data Initiative Centre ReLDI, 2018
BASE
9
Training corpus ssj500k 2.1
Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2018
BASE
10
Training corpus hr500k 1.0
Ljubešić, Nikola; Agić, Željko; Klubička, Filip. - : Jožef Stefan Institute, 2018
BASE
11
ReLDI token+tag+lemma+NER web service for WebLicht
Ljubešić, Nikola; Perovšek, Matic; Erjavec, Tomaž. - : Jožef Stefan Institute, 2017
BASE
12
CMC training corpus Janes-Tag 2.0
Erjavec, Tomaž; Fišer, Darja; Čibej, Jaka. - : Jožef Stefan Institute, 2017
BASE
13
Serbian Twitter training corpus ReLDI-NormTagNER-sr 2.0
Ljubešić, Nikola; Erjavec, Tomaž; Miličević, Maja. - : Jožef Stefan Institute, 2017
BASE
14
Wikipedia talk corpus Janes-Wiki 1.0
Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
15
Training corpus ssj500k 2.0
Krek, Simon; Dobrovoljc, Kaja; Erjavec, Tomaž. - : Centre for Language Resources and Technologies, University of Ljubljana, 2017
BASE
16
News comment corpus Janes-News 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
17
Croatian Twitter training corpus ReLDI-NormTagNER-hr 2.0
Ljubešić, Nikola; Erjavec, Tomaž; Miličević, Maja. - : Jožef Stefan Institute, 2017
BASE
18
Blog post and comment corpus Janes-Blog 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
19
Forum corpus Janes-Forum 1.0
Erjavec, Tomaž; Ljubešić, Nikola; Fišer, Darja. - : Jožef Stefan Institute, 2017
BASE
20
Twitter corpus Janes-Tweet 1.0
Ljubešić, Nikola; Erjavec, Tomaž; Fišer, Darja. - : Jožef Stefan Institute, 2017
Abstract: Janes-Tweet is an annotated corpus of almost 10 million tweets posted from 2013-06 to 2017-06 by approx. 9,000 users who tweet mostly in Slovene. The corpus is structured into individual tweets, together with their metadata. The tweets are tokenised, sentence-segmented, word-normalised, morphosyntactically tagged, lemmatised, and annotated with named entities. Due to Twitter's terms of service, the corpus is distributed in an encoded version. The included tweetpub program (also available and documented at https://github.com/clarinsi/tweetpub) should be used to decode it: it fetches the original tweets and applies a diff operation to the distributed corpus. Note that the retrieved corpus can have fewer tweets than the distributed version if some have been removed from Twitter by their authors in the meantime.
Keyword: computer-mediated communication; named entities; Twitter; word normalisation
URL: http://hdl.handle.net/11356/1142
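
The decoding idea in the abstract (fetch the original tweet, then apply a diff to the distributed version) can be illustrated with a minimal sketch. This is not tweetpub's actual file format or code: the functions `encode`, `decode` and `decode_corpus`, the `("copy", start, end)` / `("add", text)` operation layout, and the `<w>` markup in the usage note are all assumptions made for illustration, built on Python's standard `difflib`.

```python
from difflib import SequenceMatcher


def encode(original: str, annotated: str) -> list:
    """Describe `annotated` as edits against `original` (hypothetical format).

    Text shared with the original is stored only as character offsets,
    so the original tweet text itself is never copied into the output.
    """
    ops = []
    matcher = SequenceMatcher(a=original, b=annotated, autojunk=False)
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag == "equal":
            ops.append(("copy", i1, i2))          # reuse original[i1:i2]
        elif tag in ("replace", "insert"):
            ops.append(("add", annotated[j1:j2]))  # new text, e.g. markup
        # "delete" contributes nothing to the annotated text
    return ops


def decode(original: str, ops: list) -> str:
    """Rebuild the annotated text from a fetched original and the edits."""
    parts = []
    for op in ops:
        if op[0] == "copy":
            parts.append(original[op[1]:op[2]])
        else:
            parts.append(op[1])
    return "".join(parts)


def decode_corpus(encoded: dict, fetched: dict) -> dict:
    """Decode every tweet whose original text could still be fetched."""
    return {
        tweet_id: decode(fetched[tweet_id], ops)
        for tweet_id, ops in encoded.items()
        if tweet_id in fetched  # deleted tweets cannot be restored
    }
```

With a hypothetical original `"Danes je lep dan"` and an annotated form that wraps each word in `<w>` tags, `decode(original, encode(original, annotated))` returns the annotated string again. `decode_corpus` silently drops any tweet id missing from `fetched`, which mirrors the abstract's note that the retrieved corpus can be smaller than the distributed one.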
BASE


Hits by source type: Catalogues 0; Bibliographies 0; Linked Open Data catalogues 0; Online resources 0; Open access documents 22
© 2013 - 2024 Lin|gu|is|tik