1 |
Deep interactive text prediction and quality estimation in translation interfaces
In: Hokamp, Christopher M. (2018) Deep interactive text prediction and quality estimation in translation interfaces. PhD thesis, Dublin City University. (2018)
2 |
Statistical post-editing and quality estimation for machine translation systems
In: Béchara, Hanna (2014) Statistical post-editing and quality estimation for machine translation systems. Master of Science thesis, Dublin City University. (2014)
3 |
Predicting sentence translation quality using extrinsic and language independent features
In: Bicici, Ergun, Groves, Declan and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) Predicting sentence translation quality using extrinsic and language independent features. Machine Translation, 27 (3-4). pp. 171-192. ISSN 0922-6567 (2013)
4 |
Domain adaptation for statistical machine translation of corporate and user-generated content
In: Banerjee, Pratyush (2013) Domain adaptation for statistical machine translation of corporate and user-generated content. PhD thesis, Dublin City University. (2013)
5 |
CNGL: Grading student answers by acts of translation
In: Bicici, Ergun orcid:0000-0002-2293-2031 and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) CNGL: Grading student answers by acts of translation. In: SEMEVAL, 14-15 Jun 2013, Atlanta, Georgia. (2013)
6 |
Definition of interfaces
In: Almaghout, Hala, Bicici, Ergun, Doherty, Stephen orcid:0000-0003-0887-1049, Gaspari, Federico, Groves, Declan, Toral, Antonio orcid:0000-0003-2357-2960, van Genabith, Josef orcid:0000-0003-1322-7944, Popović, Maja orcid:0000-0001-8234-8745 and Piperidis, Stelios (2013) Definition of interfaces. Project Report. QTLaunchPad. (2013)
7 |
Mapping the industry I: Findings on translation technologies and quality assessment
In: Doherty, Stephen orcid:0000-0003-0887-1049, Gaspari, Federico, Groves, Declan and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) Mapping the industry I: Findings on translation technologies and quality assessment. Technical Report. GALA. (2013)
8 |
Quality metrics for human and machine translation
In: Doherty, Stephen orcid:0000-0003-4864-5986, Gaspari, Federico, Groves, Declan, Srivastava, Ankit Kumar and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) Quality metrics for human and machine translation. Project Report. UNSPECIFIED. (2013)
9 |
CNGL-CORE: Referential translation machines for measuring semantic similarity
In: Bicici, Ergun orcid:0000-0002-2293-2031 and van Genabith, Josef orcid:0000-0003-1322-7944 (2013) CNGL-CORE: Referential translation machines for measuring semantic similarity. In: *SEM, 13-14 Jun 2013, Atlanta, Georgia. (2013)
10 |
Decreasing lexical data sparsity in statistical syntactic parsing - experiments with named entities
In: Hogan, Deirdre, Foster, Jennifer orcid:0000-0002-7789-4853 and van Genabith, Josef orcid:0000-0003-1322-7944 (2011) Decreasing lexical data sparsity in statistical syntactic parsing - experiments with named entities. In: Multiword Expressions: from Parsing and Generation to the Real World (MWE). Workshop at ACL 2011, 19-24 June 2011, Portland, Oregon. (2011)
11 |
Deep Syntax in Statistical Machine Translation
Graham, Yvette. Dublin City University, National Centre for Language Technology (NCLT), 2011; Dublin City University, School of Computing, 2011
In: Graham, Yvette (2011) Deep Syntax in Statistical Machine Translation. PhD thesis, Dublin City University. (2011)
Abstract:
Statistical Machine Translation (SMT) via deep syntactic transfer employs a three-stage architecture: (i) parse source language (SL) input, (ii) transfer the SL deep syntactic structure to the target language (TL), and (iii) generate a TL translation. The deep syntactic transfer architecture achieves a high level of language-pair independence compared to other Machine Translation (MT) approaches, as translation is carried out at the more language-independent level of deep syntactic representation. TL word order can be generated independently of SL word order, so no reordering model between source and target words is required. In addition, words in dependency relations are adjacent in the deep syntactic structure, whereas they are often distant in surface-form strings. This adjacency allows the extraction of more general transfer rules than the rules or phrases extracted from a surface-form corpus. It also allows the use of a TL deep syntax language model, which models a deeper notion of fluency than a string-based language model and may lead to better lexical choice. The deep syntactic representation also contains words in lemma form with morpho-syntactic information. As a result, new inflections of lemmas that are not observed in bilingual training data, and that are out of coverage for other SMT approaches, fall within the coverage of deep syntactic transfer. In this thesis, we adapt existing methods already successful in Phrase-Based SMT (PB-SMT) to deep syntactic transfer and present new methods of our own. We present a new definition for consistent deep syntax transfer rules, inspired by the definition of a consistent phrase in PB-SMT, and we extract all rules consistent with the node alignment, since smaller rules provide high coverage of unseen data while larger rules provide more fluent combinations of TL words.
Since large numbers of consistent transfer rules exist per sentence pair, we also provide an efficient method of extracting rules as well as an efficient method of storing them. We also present a deep syntax translation model: as in other SMT approaches, we use a log-linear combination of feature functions, and include a translation model computed from relative frequencies of transfer rules, lexical weighting, a deep syntax language model and a string-based language model. In addition, we describe methods of carrying out transfer decoding, the search for TL deep syntactic structures, and how we efficiently integrate a deep syntax trigram language model into decoding, as well as methods of translating morpho-syntactic information separately from lemmas, using an adaptation of Factored Models. We include an experimental evaluation, in which we compare MT output for different configurations of our SMT via deep syntactic transfer system. We investigate various methods of word alignment, methods of translating morpho-syntactic information, limits on transfer rule size, different beam sizes during transfer decoding, generating from different-sized lists of TL decoder output structures, as well as deterministic versus non-deterministic generation. We also include an evaluation of the deep syntax language model in isolation from the MT system and compare it to a string-based language model. Finally, we compare the performance and types of translations our system produces with those of a state-of-the-art phrase-based statistical machine translation system. Although the deep syntax system in general currently under-performs, it does achieve state-of-the-art performance for translation of a specific syntactic construction, the compound noun, and for translations within coverage of the TL precision grammar used for generation. We provide the software for transfer rule extraction, as well as the transfer decoder, as open source tools to assist future research.
Keyword: Lexical Functional Grammar; Machine translating
URL: http://doras.dcu.ie/16078/
12 |
The integration of machine translation and translation memory
He, Yifan. Dublin City University, Centre for Next Generation Localisation (CNGL), 2011; Dublin City University, School of Computing, 2011
In: He, Yifan (2011) The integration of machine translation and translation memory. PhD thesis, Dublin City University. (2011)
13 |
Improving dependency label accuracy using statistical post-editing: A cross-framework study
In: Cetinoglu, Ozlem, Bryl, Anton, Foster, Jennifer orcid:0000-0002-7789-4853 and van Genabith, Josef orcid:0000-0003-1322-7944 (2011) Improving dependency label accuracy using statistical post-editing: A cross-framework study. In: International Conference on Dependency Linguistics (DepLing), 5-7 Sept 2011, Barcelona, Spain. (2011)
14 |
An automatically built named entity lexicon for Arabic
In: Attia, Mohammed, Toral, Antonio orcid:0000-0003-2357-2960, Tounsi, Lamia, Monachini, Monica and van Genabith, Josef orcid:0000-0003-1322-7944 (2010) An automatically built named entity lexicon for Arabic. In: LREC 2010 - 7th conference on International Language Resources and Evaluation, 17-23 May 2010, Valletta, Malta. (2010)
15 |
Seeding statistical machine translation with translation memory output through tree-based structural alignment
In: Zhechev, Ventsislav and van Genabith, Josef orcid:0000-0003-1322-7944 (2010) Seeding statistical machine translation with translation memory output through tree-based structural alignment. In: SSST-4 - 4th Workshop on Syntax and Structure in Statistical Translation, 28 August 2010, Beijing, China. (2010)
16 |
Arabic parsing using grammar transforms
In: Tounsi, Lamia and van Genabith, Josef orcid:0000-0003-1322-7944 (2010) Arabic parsing using grammar transforms. In: LREC 2010 - 7th conference on International Language Resources and Evaluation, 17-23 May 2010, Valletta, Malta. (2010)
17 |
LFG without C-structures
In: Cetinoglu, Ozlem, Foster, Jennifer orcid:0000-0002-7789-4853, Nivre, Joakim, Hogan, Deirdre, Cahill, Aoife orcid:0000-0002-3519-7726 and van Genabith, Josef orcid:0000-0003-1322-7944 (2010) LFG without C-structures. In: the 9th International Workshop on Treebanks and Linguistic Theories, 3-4 Dec. 2010, Tartu, Estonia. (2010)
18 |
Closing the gap between stochastic and rule-based LFG grammars
In: Hautli, Annette, Cetinoglu, Ozlem and van Genabith, Josef orcid:0000-0003-1322-7944 (2010) Closing the gap between stochastic and rule-based LFG grammars. In: the LFG10 Conference, 18-20 July 2010, Ottawa, Canada. (2010)
19 |
Lemmatization and lexicalized statistical parsing of morphologically rich languages: the case of French
In: Seddah, Djamé, Chrupała, Grzegorz, Cetinoglu, Ozlem, van Genabith, Josef orcid:0000-0003-1322-7944 and Candito, Marie (2010) Lemmatization and lexicalized statistical parsing of morphologically rich languages: the case of French. In: SPMRL 2010 - 1st Workshop on Statistical Parsing of Morphologically-Rich Languages at NAACL HLT 2010, 5 June 2010, Los Angeles, CA, USA. (2010)
20 |
Deep syntax language models and statistical machine translation
In: Graham, Yvette and van Genabith, Josef orcid:0000-0003-1322-7944 (2010) Deep syntax language models and statistical machine translation. In: SSST-4 - 4th Workshop on Syntax and Structure in Statistical Translation at COLING 2010, 28 August 2010, Beijing, China. (2010)