44 |
Conversational Telephone Speech Corpus Collection for the NIST Speaker Recognition Evaluation 2004
|
|
|
|
In: DTIC (2004)
|
|
BASE
|
|
|
|
45 |
The Mixer Corpus of Multilingual, Multichannel Speaker Recognition Data
|
|
|
|
In: DTIC (2004)
|
|
BASE
|
|
|
|
51 |
Context-Based Machine Translation ...
|
|
|
|
Abstract:
Context-Based Machine Translation™ (CBMT) is a new paradigm for corpus-based translation that requires no parallel text. Instead, CBMT relies on a lightweight translation model utilizing a full-form bilingual dictionary and a sophisticated decoder using long-range context via long n-grams and cascaded overlapping. The translation process is enhanced via in-language substitution of tokens and phrases, both for source and target, when top candidates cannot be confirmed or resolved in decoding. Substitution utilizes a synonym and near-synonym generator implemented as a corpus-based unsupervised learning process. Decoding requires a very large target-language-only corpus, and while substitution in target can be performed using that same corpus, substitution in source requires a separate (and smaller) source monolingual corpus. Spanish-to-English CBMT was tested on Spanish newswire text, achieving a BLEU score of 0.6462 in June 2006, the highest BLEU reported for any language pair. Further testing also shows that ...
|
|
Keyword:
89999 Information and Computing Sciences not elsewhere classified; FOS Computer and information sciences
|
|
URL: https://kilthub.cmu.edu/articles/Context-Based_Machine_Translation/6604451/1
DOI: https://dx.doi.org/10.1184/r1/6604451.v1
|
|
BASE
|
|
|
|
|
|