2 |
Finding topic-specific strings in text categorization and opinion mining contexts
In: DMIN'10, Jul 2010, Las Vegas, United States (2010) ; https://hal.archives-ouvertes.fr/hal-01318044
Abstract:
In this paper, we present a new probabilistic method for automatically extracting topic-specific strings in a text categorization context. The advantage of this method is twofold. First, it automatically identifies the expressions that characterize a specific topic category, for potential use in knowledge modelling. Second, it helps improve categorization results by providing the classifier with text spans that are more relevant than isolated words. The novelty of our approach thus lies not only in the method used to extract topic-specific strings but also in the adaptation of the traditional cosine similarity measure to text categorization. For the evaluation, we chose two challenging corpora: movie reviews written by Internet users and manual transcriptions of call center conversations. On both tasks, we observed a gain in categorization results (between 1 and 8%).
Keywords:
Computer Science [cs]; collocations; opinion mining; text categorization; topic-specific strings; weighted cosine
URL: https://hal.archives-ouvertes.fr/hal-01318044
3 |
Improving Update Summarization by Revisiting the MMR Criterion ...
5 |
Contextes multilingues alignés pour la désambiguïsation sémantique : une étude expérimentale [Aligned multilingual contexts for semantic disambiguation: an experimental study]
In: Actes de la 12ème conférence annuelle sur le Traitement Automatique des Langues Naturelles, 6-10 juin 2005 ; TALN 2005 ; https://hal.archives-ouvertes.fr/hal-01073712 ; TALN 2005, 2005, Dourdan, pp. 415--420 (2005)