1. Is Attention always needed? A Case Study on Language Identification from Speech ...
BASE
2. Classifier Combination Approach for Question Classification for Bengali Question Answering System ...
BASE
4. A hybrid approach for transliterated word-level language identification: CRF with post processing heuristics
BASE
7. Experiments on domain adaptation for English-Hindi SMT
Haque, Rejwanul orcid:0000-0003-1680-0099, Naskar, Sudip Kumar, van Genabith, Josef orcid:0000-0003-1322-7944 and Way, Andy orcid:0000-0001-5736-5930 (2009) Experiments on domain adaptation for English-Hindi SMT. In: PACLIC 23 - the 23rd Pacific Asia Conference on Language, Information and Computation, 3-5 December 2009, Hong Kong.
Abstract:
Statistical Machine Translation (SMT) systems are usually trained on large amounts of bilingual text and monolingual target-language text. If a significant amount of out-of-domain data is added to the training data, the quality of translation can drop. On the other hand, training an SMT system on only a small amount of in-domain training material leads to narrow lexical coverage, which again results in low translation quality. In this paper, (i) we explore domain-adaptation techniques to combine large out-of-domain training data with small-scale in-domain training data for English-Hindi statistical machine translation, and (ii) we cluster large out-of-domain training data to extract sentences similar to in-domain sentences and apply adaptation techniques to combine the clustered sub-corpora with in-domain training data in a unified framework, achieving a 0.44 absolute improvement in BLEU over the baseline, corresponding to a 4.03% relative improvement.
Keywords:
domain adaptation; Machine translating; statistical machine translation
URL: http://doras.dcu.ie/15175/
BASE
8. Dependency relations as source context in phrase-based SMT
Haque, Rejwanul orcid:0000-0003-1680-0099, Naskar, Sudip Kumar, van den Bosch, Antal and Way, Andy orcid:0000-0001-5736-5930 (2009) Dependency relations as source context in phrase-based SMT. In: PACLIC 23 - the 23rd Pacific Asia Conference on Language, Information and Computation, 3-5 December 2009, Hong Kong.
BASE
9. Using supertags as source language context in SMT
Haque, Rejwanul orcid:0000-0003-1680-0099, Naskar, Sudip Kumar, Ma, Yanjun and Way, Andy orcid:0000-0001-5736-5930 (2009) Using supertags as source language context in SMT. In: EAMT 2009 - 13th Annual Conference of the European Association for Machine Translation, 13-15 May 2009, Barcelona, Spain.
BASE
10. English-Hindi transliteration using context-informed PB-SMT: the DCU system for NEWS 2009
Haque, Rejwanul orcid:0000-0003-1680-0099, Dandapat, Sandipan, Srivastava, Ankit Kumar, Naskar, Sudip Kumar and Way, Andy orcid:0000-0001-5736-5930 (2009) English-Hindi transliteration using context-informed PB-SMT: the DCU system for NEWS 2009. In: NEWS 2009 - Named Entities Workshop, 7 August 2009, Singapore.
BASE