Hits 1 – 10 of 10

1	Lexical Link Analysis Application: Improving Web Service to Acquisition Visibility Portal
	Zhao, Ying; Gallup, Shelley P; MacKinnon, Douglas J
	In: DTIC (2013)
	BASE

2	Novel Topic Impact on Authorship Attribution
	Caver, Johnnie F.
	In: DTIC (2009)
	BASE

3	Techniques for Automatically Generating Biographical Summaries from News Articles
	Esparza, Matthew W.
	In: DTIC (2007)
	BASE

4	Stochastic Language Generation in a Dialogue System: Toward a Domain Independent Generator
	Chambers, Nathanael; Allen, James
	In: DTIC (2004)
	BASE

5	The Bible, Truth, and Multilingual OCR Evaluation
	Kanungo, Tapas; Resnik, Philip
	In: DTIC (1998)
	Abstract: Multilingual OCR has emerged as an important information technology, thanks to the increasing need for cross-language information access. While many research groups and companies have developed OCR algorithms for various languages, it is difficult to compare the performance of these algorithms across languages. This difficulty arises because most evaluation methodologies rely on a document image dataset in each of the languages, and it is difficult to find document datasets in different languages that are similar in content and layout. In this paper, we propose to use the Bible as a dataset for comparing OCR accuracy across languages. Besides being available in a wide range of languages, Bible translations are closely parallel in content, carefully translated, surprisingly relevant with respect to modern-day language, and quite inexpensive. A project at the University of Maryland is currently implementing this idea. We have created a scanned image dataset with groundtruth from an Arabic Bible. We have also used image degradation models to create synthetically degraded images of a French Bible. We hope to generate similar Bible datasets for other languages, and we are exploring alternative corpora such as the Koran and the Bhagavad Gita that have similar properties. Quantitative OCR evaluation based on the Arabic Bible dataset is currently in progress. Sponsored in part by DARPA and Army Research Lab. Report no. CS-TR-3967. Presented at the SPIE Conference on Document Recognition and Retrieval VI held in San Jose, CA on 27-28 Jan 1999. Published in the Proceedings of the SPIE Conference on Document Recognition and Retrieval VI, Proceedings of SPIE, v3651, 1999.
	Keyword: BIBLE; CORPUS; DATASETS; GROUNDTRUTH; OPTICAL CHARACTER RECOGNITION; TEST SETS; *TRANSLATIONS; ACCURACY; ALGORITHMS; Cybernetics; DOCUMENT IMAGES; DOCUMENTS; IMAGES; Information Science; LANGUAGE; Linguistics; MULTILINGUAL OCR (OPTICAL CHARACTER RECOGNITION); SYMPOSIA; TEST AND EVALUATION
	URL: http://www.dtic.mil/docs/citations/ADA458666 http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA458666
	BASE

6	Tipster Shogun System (Joint GE-CMU): MUC-4 Test Results and Analysis
	Krupka, George; Jacobs, Paul; Mauldin, Michael...
	In: DTIC (1992)
	BASE

7	GE-CMU: Description of the Tipster/Shogun System as Used for MUC-4
	Jacobs, Paul; Krupka, George; Rau, Lisa...
	In: DTIC (1992)
	BASE

8	BBN PLUM: MUC-4 Test Results and Analysis
	Weischedel, Ralph; Ayuso, Damaris; Boisen, Sean...
	In: DTIC (1992)
	BASE

9	BBN HARC and DELPHI Results on the ATIS Benchmarks - February 1991
	Austin, S.; Ayuso, D.; Bates, M....
	In: DTIC (1991)
	BASE

10	BBN PLUM: MUC-3 Test Results and Analysis
	Weischedel, Ralph; Ayuso, Damaris; Boisen, Sean...
	In: DTIC (1991)
	BASE
