Search in the Catalogues and Directories

Hits 1 – 8 of 8

1
A Hybrid Method for Opinion Finding Task (KUNLP at TREC 2008 Blog Track)
In: DTIC (2008)
BASE
2
DCU at the TREC 2008 Blog Track
In: DTIC (2008)
BASE
3
Barriers, Bridges, and Progress in Cognitive Modeling for Military Applications
In: DTIC (2008)
BASE
4
Integrating a Natural Language Message Pre-Processor with UIMA
In: DTIC (2008)
BASE
5
Laboratory for Computational Cultural Dynamics
In: DTIC (2008)
BASE
6
Semantical Machine Understanding
In: DTIC (2008)
BASE
7
IIT Kharagpur at TREC 2008 Blog Track
In: DTIC (2008)
Abstract: Blogs are often informally written, poorly structured, and filled with spelling and grammatical errors and nontraditional content. Linguistic analysis of blogs is plagued by two additional problems: (1) the presence of spam blogs and spam comments, and (2) extraneous noncontent, including blog-rolls, link-rolls, advertisements, and sidebars. Our document retrieval system was built on the Apache Lucene search engine. Lucene was able to index the whole Blog06 dataset and could retrieve the documents very quickly. To decrease the size of the index, it was necessary to remove much of the noise in the HTML. Many of the documents had malformed HTML, which was corrected using the HTML Tidy utility. We used the qrels of the Blog Track of TREC 2006 and 2007 to train the sentence-level subjectivity and polarity classifiers. This paper describes the authors' opinion retrieval system for the TREC 2008 Blog Track. The system contains five modules. The first module extracts the blog content from junk HTML, thereby decreasing the noise in the indexed content. The second module removes various kinds of spam content from real blogs. The third module retrieves relevant documents. The fourth module filters out opinionated documents, and the fifth module calculates the polarity of the sentiments in the documents. The final ranked retrieval runs were based on various combinations of settings in each module, so as to study the effect of each. Subjectivity and polarity were predicted with a complementary naive Bayes classifier. Presented at the 17th Text REtrieval Conference (TREC 2008), held in Gaithersburg, MD, on 18-21 Nov 2008. Sponsored in part by the Defense Advanced Research Projects Agency (DARPA) and the Advanced Research and Development Activity (ARDA). The original document contains color images.
Keyword: *ATTITUDES(PSYCHOLOGY); *BLOGS; *COMPUTATIONAL LINGUISTICS; *ELECTRONIC PUBLISHING; *EXPERT SYSTEMS; *EXTRACTION; *INFORMATION RETRIEVAL; *INTERNET; *OPINION EXTRACTION; AUTOMATION; BAYES THEOREM; CLASSIFICATION; Cybernetics; DATA BASES; DATA EXTRACTION; DATA PREPROCESSING; FOREIGN REPORTS; HTML(HYPER TECH MARKUP LANGUAGE); INDIA; INFORMATION FILTERS; Information Science; Linguistics; MAP(MEAN AVERAGE PRECISION); MOVIE REVIEWS; NOISE REDUCTION; OPINION FILTERING; OPINION SEARCHES; POLARITY; PRECISION; PREPROCESSING; PUBLIC OPINION; RETRIEVAL PERFORMANCE; SCORING; SPAM FILTERING; SPAM(COMPUTER SCIENCE); SPLOG DETECTION; SYMPOSIA; TREC 2008 BLOG TRACK; WEB LOGS
URL: http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA512742
http://www.dtic.mil/docs/citations/ADA512742
BASE
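
The abstract of hit 7 says only that subjectivity and polarity were predicted with a complementary naive Bayes classifier; it does not give features, smoothing, or implementation details. The following is a minimal sketch of the general complement naive Bayes technique, not the authors' system. The names `tokenize` and `ComplementNaiveBayes`, the whitespace tokenizer, the add-one smoothing, and the toy training sentences are all assumptions made for illustration.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive, assumed tokenizer)."""
    return text.lower().split()


class ComplementNaiveBayes:
    """Hypothetical minimal complement naive Bayes over bag-of-words counts."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha  # additive smoothing constant (assumed value)
        self.weights = {}   # label -> {term: log-probability under the complement}

    def fit(self, texts, labels):
        # Count terms per class.
        per_class = {}
        for text, label in zip(texts, labels):
            per_class.setdefault(label, Counter()).update(tokenize(text))

        # Count terms over the whole training set.
        total = Counter()
        for counts in per_class.values():
            total.update(counts)
        vocab = set(total)

        for label, counts in per_class.items():
            # Term counts over every document NOT in this class.
            comp = {t: total[t] - counts[t] for t in vocab}
            denom = sum(comp.values()) + self.alpha * len(vocab)
            self.weights[label] = {
                t: math.log((comp[t] + self.alpha) / denom) for t in vocab
            }
        return self

    def predict(self, text):
        counts = Counter(tokenize(text))

        def complement_score(label):
            w = self.weights[label]
            # Terms unseen in training are ignored.
            return sum(n * w[t] for t, n in counts.items() if t in w)

        # Pick the class whose complement explains the text worst.
        return min(self.weights, key=complement_score)


# Toy training data, invented for this sketch.
train_texts = [
    "i love this great movie",
    "a wonderful and great film",
    "i hate this boring movie",
    "an awful and boring film",
]
train_labels = ["positive", "positive", "negative", "negative"]

model = ComplementNaiveBayes().fit(train_texts, train_labels)
prediction = model.predict("great wonderful")
```

Estimating each class from the documents outside it tends to give more stable term weights when classes are imbalanced, which is the usual motivation for the complement variant; whether that was the authors' motivation is not stated in the record.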
8
Iterated Class-Specific Subspaces for Speaker-Dependent Phoneme Classification
In: DTIC (2008)
BASE
