4. Adding Interpretable Attention to Neural Translation Models Improves Word Alignment ...
Source: BASE
6. Comparison of Data Selection Techniques for the Translation of Video Lectures
In: The eleventh biennial conference of the Association for Machine Translation in the Americas (AMTA-2014), AMTA, Oct 2014, Vancouver, Canada. https://hal.archives-ouvertes.fr/hal-01157888
Abstract: For the task of online translation of scientific video lectures, using huge models is not possible. In order to get smaller and efficient models, we perform data selection. In this paper, we perform a qualitative and quantitative comparison of several data selection techniques, based on cross-entropy and infrequent n-gram criteria. In terms of BLEU, a combination of translation and language model cross-entropy achieves the most stable results. As another important criterion for measuring translation quality in our application, we identify the number of out-of-vocabulary words. Here, infrequent n-gram recovery shows superior performance. Finally, we combine the two selection techniques in order to benefit from both their strengths.
Keywords: [INFO.INFO-TT] Computer Science [cs]/Document and Text Processing; Domain Adaptation; Speech translation; Statistical machine translation
URL: https://hal.archives-ouvertes.fr/hal-01157888/file/paperAMTA.pdf
Source: BASE
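The cross-entropy criterion mentioned in the abstract above is commonly implemented as cross-entropy difference scoring in the style of Moore & Lewis (2010): each candidate sentence is scored by its cross-entropy under an in-domain language model minus its cross-entropy under a general-domain model, and the lowest-scoring sentences are kept. The sketch below is illustrative only (it is not the paper's implementation) and uses simple add-one-smoothed unigram language models; the function and variable names are hypothetical.

```python
import math
from collections import Counter

def unigram_logprobs(corpus, vocab):
    # Add-one smoothed unigram log-probabilities over a fixed shared vocabulary.
    counts = Counter(tok for sent in corpus for tok in sent.split())
    total = sum(counts.values()) + len(vocab)
    return {w: math.log((counts[w] + 1) / total) for w in vocab}

def cross_entropy(sent, logprobs):
    # Per-token negative log-probability of a sentence under a unigram model.
    toks = sent.split()
    return -sum(logprobs[t] for t in toks) / len(toks)

def select_by_ce_difference(pool, in_domain, k):
    # Score each pool sentence by H_in(s) - H_gen(s); lower means "looks
    # in-domain but not generic". Keep the k best-scoring sentences.
    vocab = {t for s in pool + in_domain for t in s.split()}
    lp_in = unigram_logprobs(in_domain, vocab)
    lp_gen = unigram_logprobs(pool, vocab)
    ranked = sorted(pool,
                    key=lambda s: cross_entropy(s, lp_in) - cross_entropy(s, lp_gen))
    return ranked[:k]
```

In practice higher-order smoothed n-gram models (or both a translation-model and a language-model cross-entropy, as the abstract combines) replace the toy unigram models, but the ranking-by-difference step is the same.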