1 |
Document Domain Randomization for Deep Learning Document Layout Extraction
In: Proceedings of the 16th International Conference on Document Analysis and Recognition (ICDAR), September 5-10, 2021, Lausanne, Switzerland, pp. 497-513, ⟨10.1007/978-3-030-86549-8_32⟩ ; https://hal.inria.fr/hal-03336444 (2021)
BASE
2 |
Document Domain Randomization for Deep Learning Document Layout Extraction ...
Ling, Meng; Chen, Jian; Möller, Torsten; Isenberg, Petra; Isenberg, Tobias; Sedlmair, Michael; Laramee, Robert S.; Shen, Han-Wei; Wu, Jian; Giles, C. Lee. - : arXiv, 2021
Abstract:
We present document domain randomization (DDR), the first successful transfer of convolutional neural networks (CNNs) trained only on graphically rendered pseudo-paper pages to real-world document segmentation. DDR renders pseudo-document pages by modeling randomized textual and non-textual contents of interest, with user-defined layout and font styles to support joint learning of fine-grained classes. We demonstrate competitive results using our DDR approach to extract nine document classes from the benchmark CS-150 and from papers published in two domains, namely the annual meetings of the Association for Computational Linguistics (ACL) and IEEE Visualization (VIS). We compare DDR to conditions of style mismatch and of fewer or noisier samples, which are more easily obtained in the real world. We show that high-fidelity semantic information is not necessary to label semantic classes, but style mismatch between training and test data can lower model accuracy. Using smaller training samples had a slightly detrimental effect. Finally, ...
Note: Main paper to appear in ICDAR 2021 (16th International Conference on Document Analysis and Recognition). This version contains additional materials. The associated test data is hosted on IEEE Data Port: http://doi.org/10.21227/326q-bf39 ...
Keyword:
Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences; Information Retrieval cs.IR; Machine Learning cs.LG
URL: https://dx.doi.org/10.48550/arxiv.2105.14931 https://arxiv.org/abs/2105.14931
3 |
A Neural Network-Based Linguistic Similarity Measure for Entrainment in Conversations ...
4 |
Extractive Research Slide Generation Using Windowed Labeling Ranking ...
5 |
Additional file 3 of Could graph neural networks learn better molecular representation for drug discovery? A comparison study of descriptor-based and graph-based models ...
6 |
Additional file 3 of Could graph neural networks learn better molecular representation for drug discovery? A comparison study of descriptor-based and graph-based models ...
7 |
Audio-visual Recognition of Overlapped speech for the LRS2 dataset ...
8 |
Dealing with incomplete information in linguistic group decision making by means of Interval Type‐2 Fuzzy Sets
In: International Journal of Intelligent Systems, Wiley, 2019, 34 (6), pp. 1261-1280, ⟨10.1002/int.22095⟩ ; ISSN: 0884-8173 ; EISSN: 1098-111X ; https://www.hal.inserm.fr/inserm-03026626 (2019)
9 |
Additional file 1: of Prevalence and risk factors of active pulmonary tuberculosis among elderly people in China: a population based cross-sectional study ...
10 |
Additional file 1: of Prevalence and risk factors of active pulmonary tuberculosis among elderly people in China: a population based cross-sectional study ...
11 |
Dealing with Incomplete Information in Linguistic Group Decision Making by Means of Interval Type-2 Fuzzy Sets
12 |
An interaction consensus in group decision making under distributed trust information
13 |
A minimum adjustment cost feedback mechanism based consensus model for group decision making under social network with distributed linguistic trust
14 |
Tibetan Trisyllabic Light Verb Construction Recognition
In: Zhao, Weina; Li, Lin; Liu, Huidan; & Wu, Jian. (2016). Tibetan Trisyllabic Light Verb Construction Recognition. Himalayan Linguistics, 15(1). doi:10.5070/H915130102. Retrieved from: http://www.escholarship.org/uc/item/2226c4k2 (2016)
15 |
A Chinese to Tibetan Machine Translation System with Multiple Translating Strategies
In: Liu, Huidan; Zhao, Weina; Yu, Xin; & Wu, Jian. (2016). A Chinese to Tibetan Machine Translation System with Multiple Translating Strategies. Himalayan Linguistics, 15(1). doi:10.5070/H915130103. Retrieved from: http://www.escholarship.org/uc/item/6kz2v0g3 (2016)
16 |
Trust Based Consensus Model for Social Network in an Incomplete Linguistic Information Context
18 |
Visual consensus feedback mechanism for group decision making with complementary linguistic preference relations
19 |
Consistency based estimation of fuzzy linguistic preferences. The case of reciprocal intuitionistic fuzzy preference relations
20 |
Automatic Identification of Research Articles from Crawled Documents
In: Seventh International Conference on Web Search and Data Mining, February 24-28, 2014, New York City, New York. (2014)