41 |
When is Wall a Pared and when a Muro? -- Extracting Rules Governing Lexical Selection ...
|
|
|
|
BASE
|
|
42 |
When is Wall a Pared and when a Muro?: Extracting Rules Governing Lexical Selection ...
|
|
|
|
BASE
|
|
43 |
Lexically-Aware Semi-Supervised Learning for OCR Post-Correction ...
|
|
|
|
Abstract:
Much of the existing linguistic data in many languages of the world is locked away in non-digitized books and documents. Optical character recognition (OCR) can be used to produce digitized text, and previous work has demonstrated the utility of neural post-correction methods that improve the results of general-purpose OCR systems on recognition of less-well-resourced languages. However, these methods rely on manually curated post-correction data, which are relatively scarce compared to the non-annotated raw images that need to be digitized. In this paper, we present a semi-supervised learning method that makes it possible to utilize these raw images to improve performance, specifically through the use of self-training, a technique where a model is iteratively trained on its own outputs. In addition, to enforce consistency in the recognized vocabulary, we introduce a lexically aware decoding method that augments the neural post-correction model with a count-based language model constructed from the ...
|
|
Keyword:
Computational Linguistics; Machine Learning; Machine Learning and Data Mining; Natural Language Processing
|
|
URL: https://dx.doi.org/10.48448/fycy-h885 ; https://underline.io/lecture/38192-lexically-aware-semi-supervised-learning-for-ocr-post-correction
|
|
BASE
|
|
44 |
Phrase-level Active Learning for Neural Machine Translation ...
|
|
|
|
BASE
|
|
45 |
Do Context-Aware Translation Models Pay the Right Attention? ...
|
|
|
|
BASE
|
|
46 |
AmericasNLI: Evaluating Zero-shot Natural Language Understanding of Pretrained Multilingual Models in Truly Low-resource Languages ...
|
|
|
|
BASE
|
|
47 |
Dependency Induction Through the Lens of Visual Perception ...
|
|
|
|
BASE
|
|
48 |
Dependency Induction Through the Lens of Visual Perception ...
|
|
|
|
BASE
|
|
49 |
Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas
|
|
In: Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas. Edited by: Mager, Manuel; Oncevay, Arturo; Rios, Annette; Meza Ruiz, Ivan Vladimir; Palmer, Alexis; Neubig, Graham; Kann, Katharina. Online: Association for Computational Linguistics (2021)
|
|
BASE
|
|
51 |
A set of recommendations for assessing human-machine parity in language translation
|
|
|
|
In: Läubli, Samuel orcid:0000-0001-5362-4106, Castilho, Sheila orcid:0000-0002-8416-6555, Neubig, Graham, Sennrich, Rico orcid:0000-0002-1438-4741, Shen, Qinlan and Toral, Antonio orcid:0000-0003-2357-2960 (2020) A set of recommendations for assessing human-machine parity in language translation. Journal of Artificial Intelligence Research, 67, pp. 653-672. ISSN 1076-9757 (2020)
|
|
BASE
|
|
52 |
Speech technology for unwritten languages
|
|
|
|
In: IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2020, ⟨10.1109/TASLP.2020.2973896⟩ ; ISSN: 2329-9290 ; EISSN: 2329-9304 ; https://hal.inria.fr/hal-02480675 (2020)
|
|
BASE
|
|
53 |
AlloVera: a multilingual allophone database
|
|
|
|
In: LREC 2020: 12th Language Resources and Evaluation Conference, European Language Resources Association, May 2020, Marseille, France ; https://halshs.archives-ouvertes.fr/halshs-02527046 ; https://lrec2020.lrec-conf.org/ (2020)
|
|
BASE
|
|
55 |
Explicit Alignment Objectives for Multilingual Bidirectional Encoders ...
|
|
|
|
BASE
|
|
56 |
Balancing Training for Multilingual Neural Machine Translation ...
|
|
|
|
BASE
|
|
57 |
Automatic Extraction of Rules Governing Morphological Agreement ...
|
|
|
|
BASE
|
|
58 |
A Summary of the First Workshop on Language Technology for Language Documentation and Revitalization ...
|
|
|
|
BASE
|
|
59 |
A Set of Recommendations for Assessing Human-Machine Parity in Language Translation ...
|
|
|
|
BASE
|
|
60 |
Improving Target-side Lexical Transfer in Multilingual Neural Machine Translation ...
|
|
|
|
BASE
|