21 |
The Language of Dreams: Application of Linguistics-Based Approaches for the Automated Analysis of Dream Experiences
|
|
|
|
In: Clocks & Sleep ; Volume 3 ; Issue 3 ; Pages 495-514 (2021)
|
|
BASE
|
|
22 |
Completing WordNets with Sememe Knowledge
|
|
|
|
In: Electronics; Volume 11; Issue 1; Pages: 79 (2021)
|
|
BASE
|
|
23 |
Combination of Time Series Analysis and Sentiment Analysis for Stock Market Forecasting
|
|
|
|
In: Graduate Theses and Dissertations (2021)
|
|
BASE
|
|
24 |
Comparing vector document representation methods for authorship identification ; Comparando métodos de representação vectorial de documentos para identificação de autoria
|
|
Quintanilla, Pamela Rosy Revuelta. Biblioteca Digital de Teses e Dissertações da USP ; Universidade de São Paulo ; Instituto de Matemática e Estatística, 2021
|
|
Abstract:
The amount of information available in online media has grown considerably over the years. As a result, automated natural language processing of large volumes of text has gained importance, for example in the task of text classification. Text classification can be used to identify the author of a text (authorship identification), which requires Machine Learning techniques; these techniques have given good results in NLP. Machine Learning models take as input a feature vector for each text, extracted using vector document representation methods. The methods considered in this research are grouped into three approaches: i) methods based on vector space models, ii) methods based on word embeddings, and iii) methods based on graph embeddings, for which the texts are first modeled as graphs. However, not every method is used across languages, because its effectiveness can depend on the language of the analyzed texts. The objective of this research is therefore to compare several of these methods using literary texts in English and Spanish, in order to analyze whether the methods represent several languages effectively or whether their performance depends on the characteristics of each language. The results showed that the graph embedding methods achieved the best performance for both languages, with English reaching a fairly high success rate. The other methods performed well for English, but their results for Spanish were not optimal. We believe that the results in Spanish were worse because of the morphological, lexical, and syntactic complexity of that language in comparison to English. For this reason, different approaches to the mathematical representation of texts were compared, each trying to cover different aspects of a language.
|
|
Keyword:
Authorship attribution; Complex networks; Feature extraction; Graph embeddings; Machine Learning; Text classification; Word embeddings
|
|
URL: https://doi.org/10.11606/D.45.2021.tde-05052021-040638 https://www.teses.usp.br/teses/disponiveis/45/45134/tde-05052021-040638/
|
|
BASE
|
|
25 |
Sulle tracce dell’espressione dell’interiorità: analisi diacronica di un corpus di narrativa italiana del XIX-XX secolo
|
|
|
|
BASE
|
|
26 |
Improved GloVe Word Embedding Using Linear Weighting Scheme for Word Similarity Tasks
|
|
|
|
In: Electronic Theses and Dissertations (2021)
|
|
BASE
|
|
27 |
On The Role of Machine Learning in A Human Learning Process
|
|
|
|
In: Teaching Culturally and Linguistically Diverse International Students in Open or Online Learning Environments: A Research Symposium (2021)
|
|
BASE
|
|
28 |
Detection of Hate Speech Spreaders using Convolutional Neural Networks
|
|
|
|
BASE
|
|
29 |
Whatever it takes to understand a central banker: Embedding their words using neural networks
|
|
|
|
BASE
|
|
30 |
Complete Variable-Length Codes: An Excursion into Word Edit Operations
|
|
|
|
In: LATA 2020 ; https://hal.archives-ouvertes.fr/hal-02389403 ; Mar 2020, Milan, Italy (2020)
|
|
BASE
|
|
31 |
Apprentissage de plongements de mots sur des corpus en langue de spécialité : une étude d’impact
|
|
|
|
In: Actes de la 6e conférence conjointe Journées d'Études sur la Parole (JEP, 33e édition), Traitement Automatique des Langues Naturelles (TALN, 27e édition), Rencontre des Étudiants Chercheurs en Informatique pour le Traitement Automatique des Langues (RÉCITAL, 22e édition). Volume 3 : Rencontre des Étudiants Chercheurs en Informatique pour le TAL ; https://hal.archives-ouvertes.fr/hal-02786198 ; Jun 2020, Nancy, France. pp.164-178 (2020)
|
|
BASE
|
|
32 |
Unsupervised cross-lingual representation modeling for variable length phrases ; Apprentissage de représentations cross-lingue d’expressions de longueur variable
|
|
|
|
In: https://hal.archives-ouvertes.fr/tel-02938554 ; Computation and Language [cs.CL]. Université de Nantes, 2020. English (2020)
|
|
BASE
|
|
33 |
Word Representations Concentrate and This is Good News!
|
|
|
|
In: CoNLL 2020 - 24th Conference on Computational Natural Language Learning ; https://hal.univ-grenoble-alpes.fr/hal-03356609 ; Association for Computational Linguistics (ACL), Nov 2020, Online, France. pp.325-334, ⟨10.18653/v1/2020.conll-1.25⟩ (2020)
|
|
BASE
|
|
34 |
Implementing Eco’s Model Reader with Word Embeddings. An Experiment on Facebook Ideological Bots
|
|
|
|
In: JADT - Journées d'analyse des données textuelles ; https://hal.archives-ouvertes.fr/hal-03144105 ; Jun 2020, Toulouse, France (2020)
|
|
BASE
|
|
35 |
Natural language understanding in argumentative dialogue systems ...
|
|
|
|
BASE
|
|
36 |
Automatic Creation of Correspondence Table of Meaning Tags from Two Dictionaries in One Language Using Bilingual Word Embedding
|
|
|
|
BASE
|
|
37 |
ArAutoSenti: Automatic annotation and new tendencies for sentiment classification of Arabic messages
|
|
|
|
BASE
|
|
38 |
French AXA Insurance Word Embeddings : Effects of Fine-tuning BERT and Camembert on AXA France’s data
|
|
Zouari, Hend. KTH, Skolan för elektroteknik och datavetenskap (EECS), 2020
|
|
BASE
|
|
39 |
Entropy-Based Approach for the Detection of Changes in Arabic Newspapers’ Content
|
|
|
|
In: Entropy ; Volume 22 ; Issue 4 (2020)
|
|
BASE
|
|
40 |
A Framework for Word Embedding Based Automatic Text Summarization and Evaluation
|
|
|
|
In: Information ; Volume 11 ; Issue 2 (2020)
|
|
BASE
|
|