1 |
Statistical post-editing and quality estimation for machine translation systems
In: Béchara, Hanna (2014) Statistical post-editing and quality estimation for machine translation systems. Master of Science thesis, Dublin City University. (2014)
Abstract:
Statistical post-editing (SPE) has been successfully applied to RBMT systems and, with less success, to some SMT systems. This thesis investigates the impact of SPE on SMT systems. We apply SPE to an SMT system using a new context-modelling approach to preserve some aspects of source information in the second-stage translation. This technique yields mixed results and fails to consistently improve the output over the baseline. Furthermore, we compare the results to those of an RBMT+SPE system and a pure SMT system, using both automatic and human evaluation methods. Results show that while automatic evaluation metrics favour a pure SMT system, manual evaluators prefer the output provided by the combined RBMT+SPE system. We investigate the use of machine learning methods to predict which sentences would benefit from post-editing. However, as the oracle score for SMT and SMT+SPE was not much higher than the scores of the two systems alone, we decided to compare two systems that had a higher upper bound. Combining our analysis with machine learning techniques for quality estimation, we are able to improve the overall output by automatically selecting the best sentences from each of the SMT and RBMT+SPE systems.
Keywords:
Computational linguistics; Machine translating; Post-editing; Quality Estimation
URL: http://doras.dcu.ie/19751/
BASE
2 |
Statistical Analysis of Alignment Characteristics for Phrase-based Machine Translation
In: Proceedings of the 14th European Association for Machine Translation, May 2010, Saint-Raphaël, France. https://hal.archives-ouvertes.fr/hal-00525181 (2010)
BASE
3 |
HMM word-to-phrase alignment with dependency constraints
In: Ma, Yanjun and Way, Andy orcid:0000-0001-5736-5930 (2010) HMM word-to-phrase alignment with dependency constraints. In: SSST 2010 - 4th Workshop on Syntax and Structure in Statistical Translation, 28 August 2010, Beijing, China. (2010)
BASE
4 |
Statistical analysis of alignment characteristics for phrase-based machine translation
In: Lambert, Patrik, Petitrenaud, Simon, Ma, Yanjun and Way, Andy orcid:0000-0001-5736-5930 (2010) Statistical analysis of alignment characteristics for phrase-based machine translation. In: EAMT 2010 - 14th Annual Conference of the European Association for Machine Translation, 27-28 May 2010, Saint-Raphaël, France. (2010)
BASE
5 |
Accuracy-based scoring for phrase-based statistical machine translation
In: Penkale, Sergio, Ma, Yanjun, Galron, Daniel and Way, Andy orcid:0000-0001-5736-5930 (2010) Accuracy-based scoring for phrase-based statistical machine translation. In: AMTA 2010 - 9th Conference of the Association for Machine Translation in the Americas, 31 October - 4 November 2010, Denver, CO, USA. (2010)
BASE
6 |
Integrating N-best SMT outputs into a TM system
In: He, Yifan, Ma, Yanjun, Way, Andy orcid:0000-0001-5736-5930 and van Genabith, Josef orcid:0000-0003-1322-7944 (2010) Integrating N-best SMT outputs into a TM system. In: COLING 2010 - 23rd International Conference on Computational Linguistics, 23-27 August 2010, Beijing, China. (2010)
BASE
7 |
Bridging SMT and TM with translation recommendation
In: He, Yifan, Ma, Yanjun, van Genabith, Josef orcid:0000-0003-1322-7944 and Way, Andy orcid:0000-0001-5736-5930 (2010) Bridging SMT and TM with translation recommendation. In: ACL 2010 - 48th Annual Meeting of the Association for Computational Linguistics, 11-16 July 2010, Uppsala, Sweden. (2010)
BASE
8 |
Constrained word alignment models for statistical machine translation
Ma, Yanjun. Dublin City University, National Centre for Language Technology (NCLT), 2009; Dublin City University, School of Computing, 2009
In: Ma, Yanjun (2009) Constrained word alignment models for statistical machine translation. PhD thesis, Dublin City University. (2009)
BASE
9 |
Using supertags as source language context in SMT
In: Haque, Rejwanul orcid:0000-0003-1680-0099 , Naskar, Sudip Kumar, Ma, Yanjun and Way, Andy orcid:0000-0001-5736-5930 (2009) Using supertags as source language context in SMT. In: EAMT 2009 - 13th Annual Conference of the European Association for Machine Translation, 13-15 May 2009, Barcelona, Spain. (2009)
BASE
10 |
Source-side context-informed hypothesis alignment for combining outputs from machine translation systems
In: Du, Jinhua orcid:0000-0002-3267-4881 , Ma, Yanjun and Way, Andy orcid:0000-0001-5736-5930 (2009) Source-side context-informed hypothesis alignment for combining outputs from machine translation systems. In: MT Summit XII - The twelfth Machine Translation Summit, 26-30 August 2009, Ottawa, Canada. (2009)
BASE
11 |
Bilingually motivated word segmentation for statistical machine translation
In: Ma, Yanjun and Way, Andy orcid:0000-0001-5736-5930 (2009) Bilingually motivated word segmentation for statistical machine translation. ACM Transactions on Asian Language Information Processing, 8 (2). ISSN 1530-0226 (2009)
BASE
12 |
Tracking relevant alignment characteristics for machine translation
In: Lambert, Patrik, Ma, Yanjun, Ozdowska, Sylwia and Way, Andy orcid:0000-0001-5736-5930 (2009) Tracking relevant alignment characteristics for machine translation. In: MT Summit XII - The twelfth Machine Translation Summit, 26-30 August 2009, Ottawa, Canada. (2009)
BASE
13 |
Bilingually motivated domain-adapted word segmentation for statistical machine translation
In: Ma, Yanjun and Way, Andy orcid:0000-0001-5736-5930 (2009) Bilingually motivated domain-adapted word segmentation for statistical machine translation. In: EACL 2009 Workshop on Computational Approaches to Semitic Languages, 31 March 2009, Athens, Greece. (2009)
BASE
14 |
Improving word alignment using syntactic dependencies
In: Ma, Yanjun, Ozdowska, Sylwia, Sun, Yanli and Way, Andy orcid:0000-0001-5736-5930 (2008) Improving word alignment using syntactic dependencies. In: ACL08-SSST - Proceedings of ACL08 workshop on Syntax and Structure in Statistical Translation, 20 June 2008, Columbus, Ohio, USA. (2008)
BASE
15 |
Exploiting alignment techniques in MATREX: the DCU machine translation system for IWSLT 2008
In: Ma, Yanjun, Tinsley, John, Hassan, Hany, Du, Jinhua orcid:0000-0002-3267-4881 and Way, Andy orcid:0000-0001-5736-5930 (2008) Exploiting alignment techniques in MATREX: the DCU machine translation system for IWSLT 2008. In: IWSLT 2008 - International Workshop on Spoken Language Translation, 20-21 October 2008, Hawaii, USA. (2008)
BASE
16 |
MATREX: the DCU MT System for WMT 2008
In: Tinsley, John, Ma, Yanjun, Ozdowska, Sylwia and Way, Andy orcid:0000-0001-5736-5930 (2008) MATREX: the DCU MT System for WMT 2008. In: ACL08-SMT - Proceedings of ACL08 workshop on Statistical Machine Translation, 19 June 2008, Columbus, Ohio, USA. (2008)
BASE
17 |
An investigation of question translation for English-Chinese cross-language question answering
In: Zhang, Ying, Jones, Gareth J.F. orcid:0000-0003-2923-8365 , Zhang, Sen, Wang, Bin, Guo, Yuqing and Ma, Yanjun (2007) An investigation of question translation for English-Chinese cross-language question answering. In: CIICT 2007 - Proceedings of the China-Ireland International Conference on Information and Communications Technologies, 28-29 August 2007, Dublin, Ireland. (2007)
BASE
18 |
Alignment-guided chunking
In: Ma, Yanjun, Stroppa, Nicolas and Way, Andy orcid:0000-0001-5736-5930 (2007) Alignment-guided chunking. In: TMI-07 - Proceedings of The 11th Conference on Theoretical and Methodological Issues in Machine Translation, 7-9 September 2007, Skövde, Sweden. (2007)
BASE
19 |
Bootstrapping word alignment via word packing
In: Ma, Yanjun, Stroppa, Nicolas and Way, Andy orcid:0000-0001-5736-5930 (2007) Bootstrapping word alignment via word packing. In: ACL 2007 - Proceedings of the 45th Annual Meeting of the Association for Computational Linguistics, 23-30 June 2007, Prague, Czech Republic. (2007)
BASE