1 |
Machine translation of user-generated content
|
|
Lohar, Pintu. Dublin City University, School of Computing, 2020; Dublin City University, ADAPT, 2020
|
|
In: Lohar, Pintu (2020) Machine translation of user-generated content. PhD thesis, Dublin City University. (2020)
|
|
BASE
|
|
2 |
FooTweets: a bilingual parallel corpus of World Cup tweets
|
|
|
|
In: Sluyter-Gäthje, Henny, Lohar, Pintu, Afli, Haithem orcid:0000-0002-7449-4707 and Way, Andy orcid:0000-0001-5736-5930 (2018) FooTweets: a bilingual parallel corpus of World Cup tweets. In: LREC 2018 - 11th International Conference on Language Resources and Evaluation, 7-12 May 2018, Miyazaki, Japan. ISBN 979-10-95546-00-9 (2018)
|
|
BASE
|
|
3 |
Balancing translation quality and sentiment preservation
|
|
|
|
In: Lohar, Pintu, Afli, Haithem orcid:0000-0002-7449-4707 and Way, Andy orcid:0000-0001-5736-5930 (2018) Balancing translation quality and sentiment preservation. In: AMTA 2018, 17-21 Mar 2018, Boston, MA, USA. (2018)
|
|
BASE
|
|
4 |
Sentiment translation for low resourced languages: experiments on Irish general election Tweets
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 , Maguire, Sorcha and Way, Andy orcid:0000-0001-5736-5930 (2017) Sentiment translation for low resourced languages: experiments on Irish general election Tweets. In: 18th International Conference on Computational Linguistics and Intelligent Text Processing, 17-21 Apr 2017, Budapest, Hungary. (2017)
|
|
BASE
|
|
5 |
MultiNews: a web collection of an aligned multimodal and multilingual corpus
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 , Lohar, Pintu and Way, Andy orcid:0000-0001-5736-5930 (2017) MultiNews: a web collection of an aligned multimodal and multilingual corpus. In: Workshop on Curation and Applications of Parallel and Comparable Corpora, 27 Nov - 1 Dec 2017, Taipei, Taiwan. ISBN 978-1-948087-05-6 (2017)
|
|
BASE
|
|
6 |
Maintaining sentiment polarity in translation of user-generated content
|
|
|
|
In: Lohar, Pintu, Afli, Haithem orcid:0000-0002-7449-4707 and Way, Andy orcid:0000-0001-5736-5930 (2017) Maintaining sentiment polarity in translation of user-generated content. Prague Bulletin of Mathematical Linguistics (108). pp. 73-84. ISSN 1804-0462 (2017)
|
|
BASE
|
|
7 |
Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search
|
|
|
|
In: Khwileh, Ahmad, Afli, Haithem orcid:0000-0002-7449-4707 , Jones, Gareth J.F. orcid:0000-0003-2923-8365 and Way, Andy orcid:0000-0001-5736-5930 (2017) Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search. In: Third Arabic Natural Language Processing Workshop (WANLP), 3 Apr 2017, Valencia, Spain. (2017)
|
|
BASE
|
|
8 |
Dublin City University participation in the VTT track at TRECVid 2017
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 , Hu, Feiyan orcid:0000-0001-7451-6438 , Du, Jinhua orcid:0000-0002-3267-4881 , Cosgrove, Daniel, McGuinness, Kevin orcid:0000-0003-1336-6477 , O'Connor, Noel E. orcid:0000-0002-4033-9135 , Arazo Sánchez, Eric, Zhou, Jiang orcid:0000-0002-3067-8512 and Smeaton, Alan F. orcid:0000-0003-1028-8389 (2017) Dublin City University participation in the VTT track at TRECVid 2017. In: TRECVid workshop, 13-15 Nov 2017, Gaithersburg, Md., USA. (2017)
|
|
BASE
|
|
9 |
Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search
|
|
|
|
In: Khwileh, Ahmad, Afli, Haithem orcid:0000-0002-7449-4707 , Jones, Gareth J.F. orcid:0000-0003-2923-8365 and Way, Andy orcid:0000-0001-5736-5930 (2017) Identifying effective translations for cross-lingual Arabic-to-English user-generated speech search. In: Proceedings of The Third Arabic Natural Language Processing Workshop (WANLP), 3-4 Apr 2017, Valencia, Spain. (2017)
|
|
BASE
|
|
10 |
Maintaining Sentiment Polarity in Translation of User-Generated Content
|
|
|
|
In: Prague Bulletin of Mathematical Linguistics, Vol 108, Iss 1, Pp 73-84 (2017)
|
|
BASE
|
|
11 |
The ADAPT bilingual document alignment system at WMT16
|
|
|
|
In: Lohar, Pintu, Afli, Haithem orcid:0000-0002-7449-4707 , Liu, Chao-Hong orcid:0000-0002-1235-6026 and Way, Andy orcid:0000-0001-5736-5930 (2016) The ADAPT bilingual document alignment system at WMT16. In: First Conference on Machine Translation (WMT16), 11-12 Aug 2016, Berlin, Germany. (2016)
|
|
BASE
|
|
12 |
FaDA: fast document aligner using word embedding
|
|
|
|
In: Lohar, Pintu, Ganguly, Debasis orcid:0000-0003-0050-7138 , Afli, Haithem orcid:0000-0002-7449-4707 , Way, Andy orcid:0000-0001-5736-5930 and Jones, Gareth J.F. orcid:0000-0003-2923-8365 (2016) FaDA: fast document aligner using word embedding. Prague Bulletin of Mathematical Linguistics (106). pp. 169-179. ISSN 1804-0462 (2016)
|
|
BASE
|
|
13 |
Using SMT for OCR error correction of historical texts
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 , Qui, Zhengwei, Way, Andy orcid:0000-0001-5736-5930 and Sheridan, Páraic (2016) Using SMT for OCR error correction of historical texts. In: Tenth International Conference on Language Resources and Evaluation (LREC 2016), 23-28 May 2016, Portorož, Slovenia. ISBN 978-2-9517408-9-1 (2016)
|
|
Abstract:
A trend to digitize historical paper-based archives has emerged in recent years, with the advent of digital optical scanners. Many paper-based books, textbooks, magazines, articles, and documents are being transformed into electronic versions that can be manipulated by a computer. For this purpose, Optical Character Recognition (OCR) systems have been developed to transform scanned digital text into editable computer text. However, the output text of OCR systems contains different kinds of errors; Automatic Error Correction tools can help improve the quality of electronic texts by cleaning them and removing noise. In this paper, we perform a qualitative and quantitative comparison of several error-correction techniques for historical French documents. Experimentation shows that our Machine Translation for Error Correction method is superior to other Language Modelling correction techniques, with nearly 13% relative improvement compared to the initial baseline.
|
|
Keyword:
Language Modelling; Machine translating; Optical Character Recognition; SpeechToSpeech Translation
|
|
URL: http://doras.dcu.ie/23226/
|
|
BASE
|
|
14 |
Integrating optical character recognition and machine translation of historical documents
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 and Way, Andy orcid:0000-0001-5736-5930 (2016) Integrating optical character recognition and machine translation of historical documents. In: COLING, the 26th International Conference on Computational Linguistics, 13-16 Dec 2016, Osaka, Japan. (2016)
|
|
BASE
|
|
15 |
From Arabic user-generated content to machine translation: integrating automatic error correction
|
|
|
|
In: Afli, Haithem orcid:0000-0002-7449-4707 , Aransa, Walid, Lohar, Pintu and Way, Andy orcid:0000-0001-5736-5930 (2016) From Arabic user-generated content to machine translation: integrating automatic error correction. In: 17th International Conference on Intelligent Text Processing and Computational Linguistics, 3–9 Apr 2016, Konya, Turkey. (2016)
|
|
BASE
|
|
16 |
Dublin City University and partners’ participation in the INS and VTT tracks at TRECVid 2016
|
|
|
|
In: Marsden, Mark, Mohedano, Eva, McGuinness, Kevin orcid:0000-0003-1336-6477 , Calafell, Andrea, Giró-i-Nieto, Xavier orcid:0000-0002-9935-5332 , O'Connor, Noel E. orcid:0000-0002-4033-9135 , Zhou, Jiang orcid:0000-0002-3067-8512 , Azevedo, Lucas, Daudert, Tobias, Davis, Brian, Hurlimann, Manuela, Afli, Haithem orcid:0000-0002-7449-4707 , Du, Jinhua, Ganguly, Debasis orcid:0000-0003-0050-7138 , Li, Wei B. orcid:0000-0001-7347-3501 , Way, Andy orcid:0000-0001-5736-5930 and Smeaton, Alan F. orcid:0000-0003-1028-8389 (2016) Dublin City University and partners’ participation in the INS and VTT tracks at TRECVid 2016. In: TRECVid Conference, 14-16 Nov 2016, Gaithersburg, Md., USA. (2016)
|
|
BASE
|
|
17 |
Dublin City University and Partners' participation in the INS and VTT Tracks at TRECVid 2016
|
|
|
|
BASE
|
|
18 |
FaDA: Fast Document Aligner using Word Embedding
|
|
|
|
In: Prague Bulletin of Mathematical Linguistics, Vol 106, Iss 1, Pp 169-179 (2016)
|
|
BASE
|
|
19 |
OCR Error Correction Using Statistical Machine Translation
|
|
|
|
In: 16th International Conference on Intelligent Text Processing and Computational Linguistics (CICLing 2015), 2015, Cairo, Egypt ; https://hal.archives-ouvertes.fr/hal-01433200 (2015)
|
|
BASE
|
|