1 | Relevance Feedback based on Constrained Clustering: FDU at TREC 09
In: DTIC (2009)
BASE

2 | A Journey in Entity Related Retrieval for TREC 2009
In: DTIC (2009)

3 | Lucene for n-grams using the ClueWeb Collection
In: DTIC (2009)

4 | BIT at TREC 2009 Faceted Blog Distillation Task
In: DTIC (2009)

5 | IRRA at TREC 2009: Index Term Weighting based on Divergence From Independence Model
In: DTIC (2009)

6 | POSTECH at TREC 2009 Blog Track: Top Stories Identification
In: DTIC (2009)

7 | PRIS at 2009 Relevance Feedback track: Experiments in Language Model for Relevance Feedback
In: DTIC (2009)

8 | Experiments on Related Entity Finding Track at TREC 2009
In: DTIC (2009)

9 | Facet Classification of Blogs: Know-Center at the TREC 2009 Blog Distillation Task
In: DTIC (2009)

10 | Techniques for Automatically Generating Biographical Summaries from News Articles
In: DTIC (2007)

11 | A Methodology for End-to-End Evaluation of Arabic Document Image Processing Software
In: DTIC (2006)

Abstract:
This paper describes a methodology for end-to-end evaluation of Arabic document image processing software. Various software solutions have been proposed for digitization and understanding of noisy, complex Arabic document images. Optical-character-recognition-based (OCR-based) solutions have been available for decades; however, this technology is often tailored to the most common document image type: clean, monolingual documents. Real-world documents often involve multiple languages, handwriting, logos, signatures, pictures, stylized text, and other document aspects, along with noise introduced by document aging, reproduction, or exposure to environmental factors. Document image processing solutions are maturing to deal with such complexities. Such systems include image clean-up algorithms and page segmentation, followed by various recognition or digitization algorithms: OCR, handwritten word recognition (HWR), logo identification, signature identification, and sub-image or picture identification. Indexing digitized document renditions into a search engine enables ad hoc querying of the collection. Some researchers have proposed semi-automation, a process in which human readers interpret complex documents and record a spoken rendition; the audio recordings are then processed by a spoken document retrieval (SDR) system, employing automatic speech recognition (ASR) for digitization and an information retrieval solution to enable ad hoc queries. To handle foreign languages, machine translation may be included in any of the aforementioned document image processing systems. This array of approaches results in widely varying performance. This paper discusses a methodology for evaluating the end-to-end retrieval performance of these systems: the ad hoc use case. The methodology can easily be tailored to other languages and to other document formats (e.g., audio and video).
The original document contains color images.

Keyword:
*ARABIC LANGUAGE; *COMPUTER PROGRAMS; *IMAGE PROCESSING; *INFORMATION RETRIEVAL; *METHODOLOGY; *RETRIEVAL PERFORMANCE; *SOFTWARE EVALUATION METHODOLOGY; *TEST AND EVALUATION; AD HOC USE CASE; ALGORITHMS; ASR(AUTOMATIC SPEECH RECOGNITION); CLEAN-UP ALGORITHMS; Computer Programming and Software; DIGITAL IMAGES; DIGITIZATION; Equipment and Methods; HWR(HANDWRITTEN WORD RECOGNITION); ILLEGIBLE DOCUMENTS; IMAGE PROCESSING SOFTWARE; Linguistics; MACHINE TRANSLATION; MULTILINGUAL DOCUMENTS; NOISY DOCUMENT IMAGES; OPTICAL CHARACTER RECOGNITION; PAGE SEGMENTATION; PERFORMANCE(ENGINEERING); PICTURES; PRECISION; RETRIEVAL PRECISION; RETRIEVAL RECALL; SDR(SPOKEN DOCUMENT RETRIEVAL); SPEECH RECOGNITION; SYMBOLS; Test Facilities; TREC MEASURES; TREC(TEXT RETRIEVAL CONFERENCES); WEAR; WORD RECOGNITION

URL: http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA468394
http://www.dtic.mil/docs/citations/ADA468394

12 | NTCIR CLIR Experiments at the University of Maryland
In: DTIC (2000)

13 | Experiments in Spoken Document Retrieval at CMU
In: DTIC (1997)

17 | Tipster Shogun System (Joint GE-CMU): MUC-4 Test Results and Analysis
In: DTIC (1992)

18 | GE-CMU: Description of the Tipster/Shogun System as Used for MUC-4
In: DTIC (1992)

20 | BBN HARC and DELPHI Results on the ATIS Benchmarks - February 1991
In: DTIC (1991)