21 |
Opinion and Polarity Detection within Far-East Languages in NTCIR-7
In: http://research.nii.ac.jp/ntcir/workshop/OnlineProceedings7/pdf/NTCIR7/C2/MOAT/21-NTCIR7-MOAT-ZubaryevaO.pdf
BASE
22 |
Report on CLEF-2002 experiments: Combining multiple sources of evidence
In: https://doc.rero.ch/record/17185/files/Savoy_Jacques_-_Report_on_CLEF_2002_Experiments_Combining_20100210.pdf
BASE
23 |
Searching Strategies for the Bulgarian Language
In: http://members.unine.ch/jacques.savoy/Papers/BUIR.pdf
BASE
24 |
Selecting Automatically the Best Query Translations
In: http://riao.free.fr/papers/11.pdf
BASE
25 |
Monolingual, Bilingual, and GIRT Information Retrieval at CLEF-2005
In: http://members.unine.ch/jacques.savoy/Papers/CLEF2005WP.pdf
BASE
26 |
UniNE at CLEF 2012
In: http://www.clef-initiative.eu/documents/71612/901619ab-eeeb-46ad-9188-933ed11a1641/
BASE
27 |
Cross-Language Information Retrieval: Experiments Based on CLEF 2000 Corpora
In: http://members.unine.ch/jacques.savoy/Papers/CLIRIPM.pdf
BASE
28 |
Combining Multiple Strategies for Effective Monolingual and Cross-Language Retrieval
In: http://members.unine.ch/jacques.savoy/papers/clirir.pdf
BASE
29 |
Simple and efficient classification scheme based on specific vocabulary
Abstract:
Assuming a binomial distribution for word occurrence, we propose computing a standardized Z score to define the specific vocabulary of a subset compared to that of the entire corpus. This approach is applied to weight terms (character n-gram, word, stem, lemma or sequence of them) which characterize a document. We then show how these Z score values can be used to derive a simple and efficient categorization scheme. To evaluate this proposition and demonstrate its effectiveness, we develop two experiments. First, the system must categorize speeches given by B. Obama as being either electoral or presidential speech. In a second experiment, sentences are extracted from these speeches and then categorized under the headings electoral or presidential. Based on these evaluations, the proposed classification scheme tends to perform better than a support vector machine model for both experiments, on the one hand, and on the other, shows a better performance level than a Naïve Bayes classifier on the first test and a slightly lower performance on the second (10-fold cross validation).
Copyright Springer-Verlag 2012
Keywords: Statistics in lexical analysis, Corpus linguistics, Text categorization, Machine learning, Natural language processing (NLP)
URL: http://hdl.handle.net/10.1007/s10287-012-0149-z
BASE
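The Z score idea in the abstract of entry 29 can be illustrated with a minimal sketch. This is an assumption-laden reading of the abstract, not the paper's implementation: the function names `z_scores` and `classify` are hypothetical, the term probability is estimated as the term's relative frequency in the whole corpus, and the decision rule (pick the category whose summed Z scores over the document's terms is largest) is one simple possibility that the abstract does not spell out.

```python
import math
from collections import Counter


def z_scores(subset_tokens, corpus_tokens):
    """Standardized Z score of each corpus term within a subset.

    Assumes a binomial model: a term with corpus probability p is expected
    to occur n_subset * p times in the subset, with variance
    n_subset * p * (1 - p). The corpus is assumed to include the subset.
    """
    n_corpus = len(corpus_tokens)
    n_subset = len(subset_tokens)
    corpus_tf = Counter(corpus_tokens)
    subset_tf = Counter(subset_tokens)
    scores = {}
    for term, tf_corpus in corpus_tf.items():
        p = tf_corpus / n_corpus  # estimated probability of the term
        expected = n_subset * p
        variance = n_subset * p * (1.0 - p)
        if variance == 0.0:
            continue  # degenerate case: empty subset or single-term corpus
        scores[term] = (subset_tf[term] - expected) / math.sqrt(variance)
    return scores


def classify(doc_tokens, scores_by_label):
    """Return the label whose summed Z scores over the document is largest.

    Hypothetical decision rule chosen for illustration; unseen terms
    contribute 0.
    """
    totals = {
        label: sum(scores.get(token, 0.0) for token in doc_tokens)
        for label, scores in scores_by_label.items()
    }
    return max(totals, key=totals.get)
```

A positive score marks a term as over-represented in the subset relative to the corpus, a negative score as under-represented. In practice the tokens could be character n-grams, words, stems or lemmas, as the abstract lists.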