1 |
Text+: Language- and text-based Research Data Infrastructure ...
|
|
|
|
BASE
2 |
Text+: Language- and text-based Research Data Infrastructure ...
|
|
|
|
BASE
3 |
Text+: Language- and text-based Research Data Infrastructure ...
|
|
|
|
BASE
4 |
Komponenten-basierte Metadatenschemata und Facetten-basierte Suche ...
|
|
|
|
BASE
5 |
Komponenten-basierte Metadatenschemata und Facetten-basierte Suche ...
|
|
|
|
BASE
6 |
All that glitters is not gold : a gold standard of adjective-noun collocations for German
|
|
|
|
BASE
7 |
Semantic modelling of adjective-noun collocations using FrameNet
|
|
|
|
BASE
8 |
No Word is an Island—A Transformation Weighting Model for Semantic Composition
|
|
|
|
In: Transactions of the Association for Computational Linguistics, Vol 7, Pp 437-451 (2019)
|
|
BASE
9 |
Digitale Forschungsinfrastrukturen für die Sprachwissenschaften
|
|
|
|
BASE
10 |
Connecting Resources: Which Issues Have to be Solved to Integrate CMC Corpora from Heterogeneous Sources and for Different Languages?
|
|
|
|
In: 5th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora17), Oct 2017, Bolzano, Italy, pp. 52-55 ; https://hal.archives-ouvertes.fr/hal-01918880 ; https://doi.org/10.5281/zenodo.1040713 (2017)
|
|
BASE
12 |
Connecting Resources: Which Issues Have To Be Solved To Integrate CMC Corpora From Heterogeneous Sources And For Different Languages? ...
|
|
|
|
BASE
13 |
Connecting Resources: Which Issues Have To Be Solved To Integrate CMC Corpora From Heterogeneous Sources And For Different Languages? ...
|
|
|
|
BASE
14 |
Integrating optical character recognition and machine translation of historical documents
|
|
|
|
In: Afli, Haithem (orcid:0000-0002-7449-4707) and Way, Andy (orcid:0000-0001-5736-5930) (2016) Integrating optical character recognition and machine translation of historical documents. In: COLING, the 26th International Conference on Computational Linguistics, 13-16 Dec 2016, Osaka, Japan.
|
|
Abstract:
Machine Translation (MT) plays a critical role in expanding capacity in the translation industry. However, many valuable documents, including digital documents, are encoded in formats that are not accessible to machine processing (e.g., historical or legal documents). Such documents must be passed through a process of Optical Character Recognition (OCR) to render the text suitable for MT. No matter how good the OCR is, this process introduces recognition errors, which often render MT ineffective. In this paper, we propose a new OCR-to-MT framework that adds an OCR error correction module to enhance the overall quality of translation. Experimentation shows that our new correction system, based on a combination of Language Modeling and Translation methods, outperforms the baseline system by nearly 30% relative improvement.
|
|
Keyword:
Machine translating
|
|
URL: http://doras.dcu.ie/23243/
|
|
BASE
18 |
Introduction to the Special Issue [Computational, cognitive, and linguistic approaches to the analysis of compounds and collocations]
|
|
|
|
BASE
19 |
Accurate linear-time Chinese word segmentation via embedding matching
|
|
|
|
BASE
20 |
Automatic noun compound interpretation using deep neural networks and word embeddings
|
|
|
|
BASE