2
IMMERSE: Interactive Mentoring for Multimodal Experiences in Realistic Social Encounters
In: DTIC (2015)
BASE

3
Methods for Evaluating Text Extraction Toolkits: An Exploratory Investigation
In: DTIC (2015)
BASE

4
Impact of Machine-Translated Text on Entity and Relationship Extraction
In: DTIC (2014)
BASE

5
Enabling Efficient Intelligence Analysis in Degraded Environments
In: DTIC (2013)
BASE

6
Expanding the Toolkit and Resource Environment to Assist Translation (TREAT) and Its User Base
In: DTIC (2011)
BASE

7
Introduction of Automation for the Production of Bilingual, Parallel-Aligned Text
In: DTIC (2011)
Abstract:
As the study and application of statistical machine translation (SMT) grow, progress is often circumscribed by a lack of data. The statistical models that govern SMT engines rely on many large bilingual text corpora, each composed of vast numbers of bilingual text segments. For certain languages, corpora already exist and help to power translation engines. Regrettably, this is not the case for every language the Army is interested in, making the creation or acquisition of such data a priority. To this end, a language expert in Dari and Pashto was hired to collect, prepare, and ensure the quality of bilingual text. To explore ways to aid the expert, several of the steps that the expert performed and that were necessary to the process were automated. The hypothesis was that automating selected processes would improve efficiency, measured in terms of both speed of production and quantity of data produced, even when the time needed to correct automation-caused errors was accounted for. As predicted, the net result of introducing automation was an increase in both the rate of producing correct bilingual segments and the number produced. The implications of these results for improving larger bilingual data creation and acquisition efforts are discussed. The original document contains color images.
Keywords:
*AFGHANISTAN; *AUTOMATION; *BILINGUAL DATA PRODUCTION; *BILINGUAL PARALLEL TEXT; *DATA MINING; *ENGLISH LANGUAGE; *FOREIGN LANGUAGES; *MACHINE TRANSLATION; *STATISTICAL ANALYSIS; *STATISTICAL MACHINE TRANSLATION; ACCURACY; ALIGNMENT; ARMY OPERATIONS; ARMY PERSONNEL; Computer Programming and Software; DARI LANGUAGE; DARI-ENGLISH TRANSLATION; EFFICIENCY; Linguistics; NATURAL LANGUAGE; PARSERS; PASHTO LANGUAGE; PASHTO-ENGLISH TRANSLATION; PIPELINE PROJECT; PRODUCTION; SEGMENTATION; SOFTWARE TOOLS; Statistics and Probability
URL: http://www.dtic.mil/docs/citations/ADA552756 http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA552756
BASE

8
Enhancing a Web Crawler with Arabic Search Capability
In: DTIC (2010)
BASE

9
Entity Profiling for Intelligence Using the Graphical Overview of Social and Semantic Interactions of People (GOSSIP) Software Tool
In: DTIC (2010)
BASE

11
Blog Fingerprinting: Identifying Anonymous Posts Written by an Author of Interest Using Word and Character Frequency Analysis
In: DTIC (2009)
BASE

12
CEMAP II: An Architecture and Specifications to Facilitate the Importing of Real-World Data into the CASOS Software Suite
In: DTIC (2008)
BASE

13
A Sensemaking Visualization Tool with Military Doctrinal Elements
In: DTIC (2008)
BASE

14
Work-Centered Approach to Insurgency Campaign Analysis
In: DTIC (2007)
BASE

15
An Analysis of Specware and Its Usefulness in the Verification of High Assurance Systems
In: DTIC (2006)
BASE

16
Information Visualization: The State of the Art for Maritime Domain Awareness
In: DTIC (2006)
BASE

17
Security Ontology for Annotating Resources
In: DTIC AND NTIS (2005)
BASE

18
DARPA Agent Markup Language (DAML) Unified Modeling Language (UML)-Based Ontology Toolset (UBOT)
In: DTIC (2005)
BASE

19
Naturally Speaking: A Systems Biology Tool With Natural Language Interfaces
In: DTIC (2004)
BASE

20
Evaluation of Transcription and Annotation Tools for a Multi-Modal, Multi-Party Dialogue Corpus
In: DTIC (2004)
BASE