
Search in the Catalogues and Directories

Hits 1 – 9 of 9

1
Mining an English-Chinese parallel Dataset of Financial News
In: Journal of Open Humanities Data; Vol 8 (2022); 9 ; 2059-481X (2022)
Abstract: Parallel text datasets are valuable for educational purposes, machine translation, and cross-language information retrieval, but few are domain-oriented. We have created a Chinese–English parallel dataset in the domain of finance technology, using the Financial Times website, from which we collected 60,473 news items published between 2007 and 2021. This dataset is a bilingual Chinese–English parallel dataset of news in the domain of finance. It is open access in its original state, without transformation, and was built not for machine translation, as such datasets have typically been used, but for intelligent mining, for which we conducted many experiments using up-to-date text mining techniques: clustering (topic modeling, community detection, k-means), topic prediction (naive Bayes, SVM, LSTM, BERT), and pattern discovery (dictionary-based, time series). We present the use of these techniques as a framework for other studies, not only as an application but also with an interpretation.
Keywords: classification; clustering; computer science; English-Chinese; patterns; text mining
URL: https://openhumanitiesdata.metajnl.com/jms/article/view/62
https://doi.org/10.5334/johd.62
BASE
2
Source Code for Youtube dataset processing ...
TURENNE, Nicolas. - : Zenodo, 2022
BASE
3
Source Code for Youtube dataset processing ...
TURENNE, Nicolas. - : Zenodo, 2022
BASE
4
The rumour spectrum
In: ISSN: 1932-6203 ; EISSN: 1932-6203 ; PLoS ONE ; https://hal.archives-ouvertes.fr/hal-01691934 ; PLoS ONE, Public Library of Science, 2018, 13 (1), pp.e0189080.1-27. ⟨10.1371/journal.pone.0189080⟩ (2018)
BASE
5
A semi-supervised Learning Approach to find equivalent long-string Organization Names
In: Colloque- Forum PEPS EXIA ; https://hal-enpc.archives-ouvertes.fr/hal-02310298 ; Colloque- Forum PEPS EXIA, Oct 2016, Champs sur Marne, France. 2016 (2016)
BASE
6
Clustering and Relational Ambiguity: from Text Data to Natural Data
In: EISSN: 2416-5999 ; Journal of Data Mining and Digital Humanities ; https://hal.archives-ouvertes.fr/hal-00920423 ; Journal of Data Mining and Digital Humanities, Episciences.org, 2013, 1 (1), pp.1 (2013)
BASE
7
Modelling noun-phrase dynamics in specialized text collections
In: Journal of quantitative linguistics. - London : Routledge 17 (2010) 3, 212-228
BLLDB
OLC Linguistik
8
Modeling Noun-Phrases Dynamics in Specialized Text Collections
In: ISSN: 0929-6174 ; Journal of Quantitative Linguistics ; https://hal.archives-ouvertes.fr/hal-02054488 ; Journal of Quantitative Linguistics, Taylor & Francis (Routledge), 2010, 17 (3), pp.212-228. ⟨10.1080/09296174.2010.485447⟩ (2010)
BASE
9
Bayesian Discriminant Analysis for Lexical Semantic Tagging
In: European Meeting on Cybernetics and Systems Research (EMCSR) ; https://hal.archives-ouvertes.fr/hal-03373905 ; European Meeting on Cybernetics and Systems Research (EMCSR), Apr 2002, Vienna, Austria (2002)
BASE
