41 |
Revue Ouverte d’Intelligence Artificielle
|
|
|
|
In: Revue Ouverte d'Intelligence Artificielle ; https://hal.archives-ouvertes.fr/hal-02933273 ; Revue Ouverte d'Intelligence Artificielle, Association pour la diffusion de la recherche francophone en intelligence artificielle, 2020, Revue Ouverte d'Intelligence Artificielle, 1 (1), pp.43-70 ; https://roia.centre-mersenne.org/ (2020)
|
|
BASE
|
|
|
|
42 |
Dataset for Temporal Analysis of English-French Cognates
|
|
|
|
In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020) ; 12th Conference on Language Resources and Evaluation (LREC 2020) ; https://hal.archives-ouvertes.fr/hal-03026957 ; 12th Conference on Language Resources and Evaluation (LREC 2020), May 2020, Marseille, France. pp.855-859, ⟨10.5281/zenodo.3693650⟩ (2020)
|
|
BASE
|
|
|
|
43 |
Using skeleton and Hough transform variant to correct skew in historical documents
|
|
|
|
In: ISSN: 0378-4754 ; Mathematics and Computers in Simulation ; https://hal-univ-bourgogne.archives-ouvertes.fr/hal-02447748 ; Mathematics and Computers in Simulation, Elsevier, 2020, 167, pp.389-403. ⟨10.1016/j.matcom.2019.05.009⟩ (2020)
|
|
BASE
|
|
|
|
44 |
An Ontology of Chinese Ceramic Vases
|
|
|
|
In: 12th International Conference on Knowledge Engineering and Ontology Development ; https://hal.archives-ouvertes.fr/hal-03134730 ; 12th International Conference on Knowledge Engineering and Ontology Development, Nov 2020, Budapest, Hungary. pp.53-63, ⟨10.5220/0010110600530063⟩ (2020)
|
|
BASE
|
|
|
|
45 |
Using skeleton and Hough transform variant to correct skew in historical documents.
|
|
|
|
In: ISSN: 0378-4754 ; Mathematics and Computers in Simulation ; https://hal.archives-ouvertes.fr/hal-02441469 ; Mathematics and Computers in Simulation, Elsevier, 2020, 167, pp.389-403. ⟨10.1016/j.matcom.2019.05.009⟩ (2020)
|
|
BASE
|
|
|
|
46 |
A Dataset for Multi-lingual Epidemiological Event Extraction
|
|
|
|
In: Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020) ; https://hal.archives-ouvertes.fr/hal-02732848 ; Proceedings of the 12th Conference on Language Resources and Evaluation (LREC 2020), May 2020, Marseille, France. pp.4139-4144 (2020)
|
|
BASE
|
|
|
|
47 |
Impact Analysis of Document Digitization on Event Extraction
|
|
|
|
In: CEUR Workshop Proceedings ; 4th Workshop on Natural Language for Artificial Intelligence (NL4AI 2020) co-located with the 19th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2020) ; https://hal.archives-ouvertes.fr/hal-03026148 ; 4th Workshop on Natural Language for Artificial Intelligence (NL4AI 2020) co-located with the 19th International Conference of the Italian Association for Artificial Intelligence (AI*IA 2020), Nov 2020, Virtual, Italy. pp.17-28 ; http://sag.art.uniroma2.it/NL4AI/ (2020)
|
|
BASE
|
|
|
|
48 |
Entity Linking for Historical Documents: Challenges and Solutions
|
|
Pontes, Elvys Linhares; Cabrera-Diego, Luis Adrián; Moreno, José G.; Boros, Emanuela; Hamdi, Ahmed; Sidère, Nicolas; Coustaty, Mickaël; Doucet, Antoine
|
|
In: 22nd International Conference on Asia-Pacific Digital Libraries, ICADL 2020 ; https://hal.archives-ouvertes.fr/hal-03034492 ; 22nd International Conference on Asia-Pacific Digital Libraries, ICADL 2020, 12504, Springer, pp.215-231, 2020, Lecture Notes in Computer Science, 978-3-030-64452-9. ⟨10.1007/978-3-030-64452-9_19⟩ (2020)
|
|
Abstract:
Named entities (NEs) are among the most relevant types of information that can be used to efficiently index and retrieve digital documents. Furthermore, the use of Entity Linking (EL) to disambiguate and relate NEs to knowledge bases provides supplementary information that can be useful for differentiating ambiguous elements such as geographical locations and people's names. In historical documents, the detection and disambiguation of NEs is a challenge. Most historical documents are converted into plain text using an optical character recognition (OCR) system, at the expense of some noise. Documents in digital libraries will therefore be indexed with errors that may hinder their accessibility. OCR errors affect not only document indexing but also the detection, disambiguation, and linking of NEs. This paper analyses the performance of different EL approaches on two multilingual historical corpora, CLEF HIPE 2020 (English, French, German) and NewsEye (Finnish, French, German, Swedish), and proposes several techniques for alleviating the impact of historical data problems on the EL task. Our findings indicate that the proposed approaches not only outperform the baseline on both corpora but also considerably reduce the impact of historical document issues across different subjects and languages.
|
|
Keywords:
[INFO.INFO-AI]Computer Science [cs]/Artificial Intelligence [cs.AI]; [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-DL]Computer Science [cs]/Digital Libraries [cs.DL]; [INFO.INFO-HC]Computer Science [cs]/Human-Computer Interaction [cs.HC]; [INFO.INFO-IR]Computer Science [cs]/Information Retrieval [cs.IR]; [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]; [INFO.INFO-TT]Computer Science [cs]/Document and Text Processing; Deep learning; Digital libraries; Entity linking; Historical data
|
|
URL: https://doi.org/10.1007/978-3-030-64452-9_19 https://hal.archives-ouvertes.fr/hal-03034492/document https://hal.archives-ouvertes.fr/hal-03034492/file/ICADL_2020___12_14_pages___references.pdf https://hal.archives-ouvertes.fr/hal-03034492
|
|
BASE
|
|
|
|
49 |
Robust Named Entity Recognition and Linking on Historical Multilingual Documents
|
|
|
|
In: Conference and Labs of the Evaluation Forum (CLEF 2020) ; https://hal.archives-ouvertes.fr/hal-03026969 ; Conference and Labs of the Evaluation Forum (CLEF 2020), Sep 2020, Thessaloniki, Greece. pp.1-17, ⟨10.5281/zenodo.4068074⟩ ; http://ceur-ws.org/Vol-2696/paper_171.pdf (2020)
|
|
BASE
|
|
|
|
50 |
Linking Named Entities across Languages using Multilingual Word Embeddings
|
|
|
|
In: JCDL '20: The ACM/IEEE Joint Conference on Digital Libraries in 2020 ; ACM/IEEE Joint Conference on Digital Libraries - JCDL 2020 ; https://hal.archives-ouvertes.fr/hal-03026933 ; ACM/IEEE Joint Conference on Digital Libraries - JCDL 2020, Aug 2020, Wuhan, Hubei - Virtual event, China. pp.329-332, ⟨10.1145/3383583.3398597⟩ ; https://dl.acm.org/doi/10.1145/3383583.3398597 (2020)
|
|
BASE
|
|
|
|
51 |
Keyphrase Generation for Scientific Document Retrieval
|
|
|
|
In: The 58th Annual Meeting of the Association for Computational Linguistics (ACL) ; https://hal.archives-ouvertes.fr/hal-02556086 ; The 58th Annual Meeting of the Association for Computational Linguistics (ACL), Jul 2020, Online, United States. ⟨10.18653/v1/2020.acl-main.105⟩ (2020)
|
|
BASE
|
|
|
|
52 |
Concevoir un dispositif innovant pour professionaliser la formation au référencement web: le projet SEO-ELP
|
|
|
|
In: ACFAS ; https://hal.archives-ouvertes.fr/hal-03506079 ; ACFAS, 2020, Sherbrooke, Canada (2020)
|
|
BASE
|
|
|
|
53 |
Independent publishers and social networks in the 21st century: the balance of power in the transatlantic Spanish-language book market ...
|
|
|
|
BASE
|
|
|
|
54 |
The European Language Technology Landscape in 2020: Language-Centric and Human-Centric AI for Cross-Cultural Communication in Multilingual Europe ...
|
|
|
|
BASE
|
|
|
|
55 |
Exhaustive Entity Recognition for Coptic: Challenges and Solutions ...
|
|
|
|
BASE
|
|
|
|
56 |
The citation impact of social sciences and humanities upon patentable technology ...
|
|
|
|
BASE
|
|
|
|
57 |
Analyzing the relationship between text features and research proposal productivity ...
|
|
|
|
BASE
|
|
|
|
58 |
The STEM-ECR Dataset: Grounding Scientific Entity References in STEM Scholarly Content to Authoritative Encyclopedic and Lexicographic Sources ...
|
|
|
|
BASE
|
|
|
|
59 |
Determining crucial factors for the popularity of scientific articles ...
|
|
|
|
BASE
|
|
|
|
60 |
A Corpus of Adpositional Supersenses for Mandarin Chinese ...
|
|
|
|
BASE
|
|
|
|
|
|