1. A Hybrid Method for Opinion Finding Task (KUNLP at TREC 2008 Blog Track)
In: DTIC (2008)
BASE

2. TREC 2008 at the University at Buffalo: Legal and Blog Track
In: DTIC (2008)
BASE

4. KLE at TREC 2008 Blog Track: Blog Post and Feed Retrieval
In: DTIC (2008)
BASE

6. Applying A Formal Language of Command and Control For Interoperability Between Systems
In: DTIC (2008)
BASE

7. Automating Convoy Training Assessment to Improve Soldier Performance
In: DTIC (2008)
BASE

8. Barriers, Bridges, and Progress in Cognitive Modeling for Military Applications
In: DTIC (2008)
BASE

9. Odds of Successful Transfer of Low-level Concepts: A Key Metric for Bidirectional Speech-to-Speech Machine Translation in DARPA's TRANSTAC Program
In: DTIC (2008)
BASE

11. Integrating a Natural Language Message Pre-Processor with UIMA
In: DTIC (2008)
BASE

12. CSIR at TREC 2008 Expert Search Task: Modeling Expert Evidence in Expert Search
In: DTIC (2008)
BASE

13. Support for the Annual Meeting (30th) of the Cognitive Science Society
In: DTIC (2008)
BASE

14. Laboratory for Computational Cultural Dynamics
In: DTIC (2008)
BASE

16. Natural Language Dialogue Architectures for Tactical Questioning Characters
In: DTIC (2008)
BASE

17. IIT Kharagpur at TREC 2008 Blog Track
In: DTIC (2008)
Abstract:
Blogs are often informally written, poorly structured, and filled with spelling and grammatical errors and nontraditional content. Linguistic analysis of blogs faces two additional problems: (1) the presence of spam blogs and spam comments, and (2) extraneous non-content material, including blog-rolls, link-rolls, advertisements, and sidebars. Our document retrieval system was built on the Apache Lucene search engine. Lucene was able to index the whole Blog06 dataset and could retrieve the documents very quickly. To decrease the size of the index, it was necessary to remove much of the noise in the HTML. Many of the documents had malformed HTML, which was corrected using the HTML Tidy utility. We used the qrels of the TREC 2006 and 2007 Blog Tracks to train the sentence-level subjectivity and polarity classifiers. This paper describes the authors' opinion retrieval system for the TREC 2008 Blog Track. The system contains five modules. The first module extracts the blog content from junk HTML, thereby decreasing the noise in the indexed content. The second module removes various kinds of spam content from real blogs. The third module retrieves relevant documents. The fourth module filters out opinionated documents, and the fifth module calculates the polarity of the sentiments in the documents. The final ranked retrieval runs were based on various combinations of settings in each module, so as to study the effect of each. For classification of subjectivity and polarity, predictions were made with a complementary naive Bayes classifier.
Presented at the Text REtrieval Conference (17th) (TREC 2008) held in Gaithersburg, MD, on 18-21 Nov 2008. Sponsored in part by the Defense Advanced Research Projects Agency (DARPA) and the Advanced Research and Development Activity (ARDA). The original document contains color images.
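The classification step mentioned in the abstract can be illustrated with a minimal sketch of a complement naive Bayes text classifier. This is written from the general technique, not from the authors' code: the class name, the whitespace tokenization, the smoothing value, and the four toy training sentences are all assumptions made for illustration, whereas the paper reports training on the TREC 2006 and 2007 Blog Track qrels.

```python
import math
from collections import Counter


class ComplementNB:
    """Complement naive Bayes over bag-of-words counts (illustrative sketch)."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # Laplace smoothing strength (assumed value)

    def fit(self, docs, labels):
        self.classes = sorted(set(labels))
        per_class = {c: Counter() for c in self.classes}
        for doc, label in zip(docs, labels):
            per_class[label].update(doc.lower().split())
        self.vocab = sorted(set().union(*per_class.values()))
        total = Counter()
        for counts in per_class.values():
            total.update(counts)
        self.log_theta = {}
        for c in self.classes:
            # Term counts from every class except c (the "complement" of c).
            comp = total - per_class[c]
            denom = sum(comp.values()) + self.alpha * len(self.vocab)
            self.log_theta[c] = {
                t: math.log((comp[t] + self.alpha) / denom) for t in self.vocab
            }
        return self

    def predict(self, doc):
        # Terms never seen in training are ignored.
        tokens = [t for t in doc.lower().split() if t in self.vocab]
        # Pick the class whose complement explains the document worst.
        return min(
            self.classes,
            key=lambda c: sum(self.log_theta[c][t] for t in tokens),
        )


# Hypothetical toy data, not taken from the paper or from the Blog06 corpus.
train_docs = [
    "i love this phone",
    "this movie was terrible",
    "the phone was released in march",
    "the movie runs for two hours",
]
train_labels = ["subjective", "subjective", "objective", "objective"]

clf = ComplementNB().fit(train_docs, train_labels)
print(clf.predict("i love this movie"))  # subjective
```

The same class could be trained a second time with positive and negative labels to sketch the polarity step; whether the authors shared features between the two classifiers is not stated in the abstract.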
Keyword:
*ATTITUDES(PSYCHOLOGY); *BLOGS; *COMPUTATIONAL LINGUISTICS; *ELECTRONIC PUBLISHING; *EXPERT SYSTEMS; *EXTRACTION; *INFORMATION RETRIEVAL; *INTERNET; *OPINION EXTRACTION; AUTOMATION; BAYES THEOREM; CLASSIFICATION; Cybernetics; DATA BASES; DATA EXTRACTION; DATA PREPROCESSING; FOREIGN REPORTS; HTML(HYPER TECH MARKUP LANGUAGE); INDIA; INFORMATION FILTERS; Information Science; Linguistics; MAP(MEAN AVERAGE PRECISION); MOVIE REVIEWS; NOISE REDUCTION; OPINION FILTERING; OPINION SEARCHES; POLARITY; PRECISION; PREPROCESSING; PUBLIC OPINION; RETRIEVAL PERFORMANCE; SCORING; SPAM FILTERING; SPAM(COMPUTER SCIENCE); SPLOG DETECTION; SYMPOSIA; TREC 2008 BLOG TRACK; WEB LOGS
URL: http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA512742 http://www.dtic.mil/docs/citations/ADA512742
BASE

20. Empirical Properties of Multilingual Phone-To-Word Transduction
In: DTIC (2008)
BASE