1 |
Between History and Natural Language Processing: Study, Enrichment and Online Publication of French Parliamentary Debates of the Early Third Republic (1881-1899)
In: ParlaCLARIN III at LREC2022 - Workshop on Creating, Enriching and Using Parliamentary Corpora, Jun 2022, Marseille, France ; https://hal.archives-ouvertes.fr/hal-03623351 ; https://www.clarin.eu/ParlaCLARIN-III (2022)
2 |
Chinese-Uyghur Bilingual Lexicon Extraction Based on Weak Supervision
In: Information; Volume 13; Issue 4; Pages: 175 (2022)
3 |
Investigating the Efficient Use of Word Embedding with Neural-Topic Models for Interpretable Topics from Short Texts
In: Sensors; Volume 22; Issue 3; Pages: 852 (2022)
4 |
Analysis of the Effects of Lockdown on Staff and Students at Universities in Spain and Colombia Using Natural Language Processing Techniques
In: International Journal of Environmental Research and Public Health; Volume 19; Issue 9; Pages: 5705 (2022)
5 |
An Enhanced Neural Word Embedding Model for Transfer Learning
In: Applied Sciences; Volume 12; Issue 6; Pages: 2848 (2022)
6 |
Deep Sentiment Analysis Using CNN-LSTM Architecture of English and Roman Urdu Text Shared in Social Media
In: Applied Sciences; Volume 12; Issue 5; Pages: 2694 (2022)
7 |
Predicting Academic Performance: Analysis of Students’ Mental Health Condition from Social Media Interactions
In: Behavioral Sciences; Volume 12; Issue 4; Pages: 87 (2022)
8 |
Vec2Dynamics: A Temporal Word Embedding Approach to Exploring the Dynamics of Scientific Keywords—Machine Learning as a Case Study
In: Big Data and Cognitive Computing; Volume 6; Issue 1; Pages: 21 (2022)
9 |
Methods, Models and Tools for Improving the Quality of Textual Annotations
In: Modelling; Volume 3; Issue 2; Pages: 224-242 (2022)
10 |
Creating multi-scripts sentiment analysis lexicons for Algerian, Moroccan and Tunisian dialects
In: 7th International Conference on Data Mining (DTMN 2021) Computer Science Conference Proceedings in Computer Science & Information Technology (CS & IT), Sep 2021, Copenhagen, Denmark ; https://hal.archives-ouvertes.fr/hal-03308111 (2021)
11 |
Bilingual English-German word embedding models for scientific text ...
12 |
Bilingual English-German word embedding models for scientific text ...
13 |
以《Cofacts 真的假的》資料庫為基礎建立中文科學假訊息之探勘模型 ; Text Mining Model for Detecting Chinese Fake Scientific Messages based on Cofacts Open Data
Abstract:
Most messages reviewed by fact-checking websites are fake messages circulating on the Internet, and the proportion of true messages is much smaller. Applying deep learning methods directly to such imbalanced data tends to produce biased or overfitted models. This study proposes augmenting the true messages in the dataset with articles from scientific publisher websites. We also hypothesized that semantic representations outperform lexical ones, so we explored how different word embeddings affect the detection of Chinese fake news. Finally, we compared traditional machine learning methods with deep learning methods. The experiments showed that the extra scientific articles did not help: with them, accuracy was generally below 0.65, and the dataset without them scored higher. The choice of word representation mattered: accuracy reached 0.72 with Word2vec, 0.7 with FastText, and 0.65 with one-hot encoding. With some tuning, a network with one CNN layer and two bidirectional LSTM layers achieved the highest accuracy, 80.38, with an F1 of 0.88 (0.89 in the Chinese abstract). Among the traditional methods, KNN reached an accuracy of 75.67 with an F1 of 0.88, and J48 reached 77.63 with an F1 of 0.87. These results agree with current findings that deep learning methods help in detecting fake news and messages.
Keyword:
Machine Learning; Deep Learning; Scientific News; Fake News Filtering; Word Embedding
URL: http://140.127.82.166/handle/987654321/20972 http://140.127.82.166/bitstream/987654321/20972/1/109NPTU0396002-001.pdf
14 |
Automatic Part-of-Speech Tagging for Security Vulnerability Descriptions ...
15 |
Automatic Part-of-Speech Tagging for Security Vulnerability Descriptions ...
18 |
Text ranking based on semantic meaning of sentences ; Textrankning baserad på semantisk betydelse hos meningar
19 |
Efficient Estimate of Low-Frequency Words’ Embeddings Based on the Dictionary: A Case Study on Chinese
In: Applied Sciences ; Volume 11 ; Issue 22 (2021)
20 |
Acoustic Word Embeddings for End-to-End Speech Synthesis
In: Applied Sciences ; Volume 11 ; Issue 19 (2021)