
Search in the Catalogues and Directories

Hits 1 – 18 of 18

1
A bilingual approach to specialised adjectives through word embeddings in the karstology domain ...
BASE
2
A bilingual approach to specialised adjectives through word embeddings in the karstology domain ...
BASE
3
A bilingual approach to specialised adjectives through word embeddings in the karstology domain ...
BASE
4
EMBEDDIA tools output example corpus of Estonian, Croatian and Latvian news articles 1.0
Abstract: This dataset contains articles from EMBEDDIA Media partners, enriched with the output of tools developed within the EMBEDDIA project:
- 12,390 Estonian articles from 2019 with tags given by Ekspress Meedia. The complete dataset without the output of EMBEDDIA tools is available at http://hdl.handle.net/11356/1408
- 5,000 Croatian articles from autumn of 2010 with tags given by 24sata. The complete dataset without the output of EMBEDDIA tools is available at http://hdl.handle.net/11356/1410
- 15,264 Latvian articles from 2019 with tags given by Ekspress Meedia. The complete dataset without the output of EMBEDDIA tools is available at http://hdl.handle.net/11356/1409
All the articles in the dataset have been analysed with the texta-mlp Python package (https://pypi.org/project/texta-mlp/) via the EMBEDDIA Media assistant's Texta Toolkit (https://docs.texta.ee/). The following tools were used to analyse the articles:
- Latin1 and Latin2 Named Entity Recognition Tool modules (Cabrera-Diego et al., 2021, both described in https://aclanthology.org/2021.bsnlp-1.12/). The Latin 1 results can be found in the folders annotated_articles_ner_latin1/ and annotated_articles_all_tools/, while the Latin 2 results are in annotated_articles_nerlatin2/ or annotated_articles_all_tools/.
- RAKUN keyword extractor. RAKUN (Škrlj et al. 2019) is an unsupervised system for keyword extraction, so it can be used for any language. It detects keywords by turning the text into a graph; the most important nodes in the graph mostly turn out to be the keywords. It is described in https://link.springer.com/chapter/10.1007/978-3-030-31372-2_26. The keyword annotation results can be found in the folder annotated_articles_rakun/ or annotated_articles_all_tools/.
- TNT-KID keyword extractor. TNT-KID (Martinc et al. 2021) is a supervised system for automatic keyword extraction. It was trained on a corpus of articles with human-assigned keywords. For Croatian, the annotators were 24sata editors, for Estonian the Ekspress Meedia staff and for Latvian the Latvian Delfi staff. The system is further documented at https://doi.org/10.1017/S1351324921000127. For Croatian, only TNT-KID was applied, while for Estonian and Latvian, TNT-KID with TF-IDF, an extension by Koloski et al. (https://aclanthology.org/2021.hackashop-1.4.pdf), was used. The results of applying this tool are found in the folder annotated_articles_tnt_kid/ or annotated_articles_all_tools/.
- Sentiment analysis. Our news sentiment analyser (Pelicon et al. 2020) labels a news article as being of positive, negative, or neutral sentiment, using a fine-tuned multilingual BERT model, which was trained on Slovene sentiment-annotated news articles. The system is further documented in https://doi.org/10.3390/app10175993. The results of this tool are found in the folder annotated_articles_sentiment/ or annotated_articles_all_tools/.
All the data is encoded in "JSON Lines" format. Each folder has its own README file which explains the structure of the files.
Keyword: keyword extraction; named entity recognition; sentiment classification
URL: http://hdl.handle.net/11356/1485
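Since the abstract states that all data is encoded in "JSON Lines" format (one JSON object per line), the following is a minimal Python sketch of how such a file might be read. The file name sample.jsonl and the fields "id" and "text" are hypothetical placeholders used only for illustration; the actual file names and record structure are described in the README of each folder.

```python
import json
import tempfile
from pathlib import Path


def read_jsonl(path):
    """Yield one parsed JSON object per non-empty line of a JSON Lines file."""
    with open(path, encoding="utf-8") as handle:
        for line in handle:
            line = line.strip()
            if line:  # tolerate blank lines between records
                yield json.loads(line)


# Hypothetical records: the real field names come from each folder's README.
sample = [
    {"id": 1, "text": "Esimene artikkel"},
    {"id": 2, "text": "Teine artikkel"},
]

with tempfile.TemporaryDirectory() as tmp:
    # Write a small stand-in file so the sketch runs without the dataset.
    path = Path(tmp) / "sample.jsonl"
    lines = (json.dumps(record, ensure_ascii=False) for record in sample)
    path.write_text("\n".join(lines) + "\n", encoding="utf-8")

    records = list(read_jsonl(path))

print(len(records), "records read")
```

Reading line by line with a generator keeps memory use low, which can matter for files holding thousands of articles.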
BASE
5
Out of Thin Air: Is Zero-Shot Cross-Lingual Keyword Detection Better Than Unsupervised? ...
BASE
6
Word-embedding based bilingual terminology alignment ...
BASE
7
Word-embedding based bilingual terminology alignment ...
BASE
8
Keyword extraction datasets for Croatian, Estonian, Latvian and Russian 1.0
Koloski, Boshko; Pollak, Senja; Škrlj, Blaž. - : Ekspress Meedia Group, 2021. : Styria Media Group, 2021
BASE
9
24sata news article archive 1.0
Purver, Matthew; Shekhar, Ravi; Pranjić, Marko. - : Styria Media Group, 2021
BASE
10
Discovery Team at SemEval-2020 Task 1: Context-sensitive Embeddings not Always Better Than Static for Semantic Change Detection ...
BASE
11
Discovery Team at SemEval-2020 Task 1: Context-sensitive Embeddings not Always Better Than Static for Semantic Change Detection ...
BASE
12
Temporal Integration of Text Transcripts and Acoustic Features for Alzheimer's Diagnosis Based on Spontaneous Speech
In: Front Aging Neurosci (2021)
BASE
13
Mining semantic relations from comparable corpora through intersections of word embeddings. ...
BASE
14
Leveraging Contextual Embeddings for Detecting Diachronic Semantic Shift ...
BASE
15
Mining semantic relations from comparable corpora through intersections of word embeddings. ...
BASE
16
Leveraging Contextual Embeddings for Detecting Diachronic Semantic Shift ...
BASE
17
Reproduction, replication, analysis and adaptation of a term alignment approach [Journal]
Repar, Andraž [author]; Martinc, Matej [author]; Pollak, Senja [author]
DNB Subject Category Language
18
Leveraging Contextual Embeddings for Detecting Diachronic Semantic Shift ...
BASE

© 2013 - 2024 Lin|gu|is|tik