
Search in the Catalogues and Directories

Hits 1 – 5 of 5

1
Intelligent Record Linkage Techniques Based on Information Retrieval, Natural Language Processing, and Machine Learning
In: DTIC AND NTIS (2002)
Abstract: The objective of this STTR project is to develop an information management system that rapidly and accurately links records of related information from web-based information sources. The sheer magnitude of information available online via the Internet has overwhelmed the ability of existing search tools to produce useful query responses. Current web-search techniques typically fail to correlate relevant documents that are identified in different ways, such as by synonyms and acronyms (aliases). The challenge is to find an approach that can obtain highly accurate matches even when those documents do not share any obvious attributes with the query, while requiring minimal information from the user. Latent Semantic Analysis (LSA) is a technique for identifying both semantically similar words and semantically similar documents. On the face of it, LSA should work well for the task of discovering aliases. That is, for a given name we can use LSA to produce a rank-ordered list of words that are semantically similar to it, and aliases for the name should be high in this list. In Phase I, we tested this conjecture empirically and found, surprisingly, that under a broad range of circumstances a straightforward application of LSA fails to rank the aliases highly. We then developed a two-stage algorithm that takes the output of LSA, creates a new set of pseudo-documents, and runs LSA again on these new documents. Empirical results show that this two-stage algorithm performs remarkably well in identifying aliases, even in those cases for which a single application of LSA fails miserably. University of Maryland (Baltimore County) is the research institute partner for this effort, under the direction of Professor Charles Nicholas and Tim Oates. Prepared in cooperation with Univ. of Maryland, Baltimore County, Baltimore, MD.
Keyword: *COMPUTATIONAL LINGUISTICS; Computer Programming and Software; Cybernetics; INFORMATION RETRIEVAL; Information Science; KNOWLEDGE MANAGEMENT; LEARNING MACHINES; Linguistics; NATURAL LANGUAGE; SEMANTICS; STTR(SMALL BUSINESS TECHNOLOGY TRANSFER); TEXT PROCESSING; WORD RECOGNITION
URL: http://www.dtic.mil/docs/citations/ADA408937
http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA408937
BASE
2
Machine Translation of Battlefield Messages by Lexico-Structural Transfer.
In: DTIC AND NTIS (1997)
BASE
3
Analogical Explanations
In: DTIC AND NTIS (1990)
BASE
4
Linguistic-Technical Aspects of Machine Translation (Aspects Linguistiques et Techniques de la Traduction Automatique)
In: DTIC AND NTIS (1988)
BASE
5
Linguistic Barriers: Translation Problems (La Barriere Linguistique: Problemes de Traduction)
In: DTIC AND NTIS (1988)
BASE

Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 5
© 2013 - 2024 Lin|gu|is|tik