
Search in the Catalogues and Directories

Hits 1 – 4 of 4

1
Blog Fingerprinting: Identifying Anonymous Posts Written by an Author of Interest Using Word and Character Frequency Analysis
In: DTIC (2009)
Abstract: Internet blogs are an easily accessible means of global communications. Monitoring blogs for criminal and terrorist activity is a serious challenge, due to blogs' anonymous nature and the sheer volume of data. The intelligence community is often faced with more information than it can process. The need exists to develop methods for processing the massive amounts of data these media present, without a significant increase in manpower. An automated tool capable of identifying posts written by an individual, given a sample of his writing, would allow law enforcement and intelligence agencies to gather evidence that would otherwise be overlooked due to manpower and time constraints. This research focuses on identifying blog posts written by a particular author, when we do not have a model of every potential author. Previous research either builds a distinct model for every possible author, or limits itself to large documents. Neither approach is appropriate for processing blog posts. Blog posts tend to be short documents, and building a distinct model of each author is unreasonable if you are looking for one author among millions. We address this problem by combining sample posts by other authors to create a model of an "average author." Note: The original document contains color images.
Keyword: *AUTHOR ATTRIBUTION; *AUTHORSHIP ATTRIBUTION; *AUTHORSHIP VERIFICATION; *BLOGS(WEB LOGS); *CRIMINOLOGY; *CYBERTERRORISM; *INTERNET; *NATURAL LANGUAGE; *PATTERN RECOGNITION; *SOCIAL COMMUNICATION; *TEXT PROCESSING; AUTOMATION; BAYES THEOREM; CHARACTER FREQUENCY ANALYSIS; Command; Computer Programming and Software; Control and Communications Systems; Cybernetics; DISTINCTIVE MISSPELLINGS; DOCUMENTS; F-SCORES; FREQUENCY; GLOBAL COMMUNICATIONS; HYPERPLANES; Information Science; INTELLIGENCE; INTERNET COMMUNICATION; INVERSE DOCUMENT FREQUENCY; KOPPEL; KOPPEL'S UNMASKING; LAW ENFORCEMENT; LEARNING MACHINES; Linguistics; MACHINE LEARNING; MANPOWER; Military Intelligence; N-GRAMS; NAIVE BAYES; NATURAL LANGUAGE PROCESSING; POSTS(WEB LOGS); SHORT RANGE(TIME); Sociology and Law; SVM(SUPPORT VECTOR MACHINES); TERM FREQUENCY; TERRORISM; THESES; TOKENIZING; TOOLS; WORD FREQUENCY ANALYSIS
URL: http://www.dtic.mil/docs/citations/ADA508981
http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA508981
BASE
2
A Study of Topic and Topic Change in Conversational Threads
In: DTIC (2009)
BASE
3
Detecting Age in Online Chat
In: DTIC (2009)
BASE
4
Speech Processing in Realistic Battlefield Environments (Le Traitement de la Parole en Environnement de Combat Realiste)
In: DTIC (2009)
BASE

Open access documents: 4
© 2013 - 2024 Lin|gu|is|tik