1 |
Toward Creation of Ancash Quechua Lexical Resources from OCR
|
|
|
|
In: Proceedings of the First Workshop on Natural Language Processing for Indigenous Languages of the Americas (AmericasNLP), The Association for Computational Linguistics, pp.163-167, 2021, ISBN 978-1-954085-44-2 ; https://hal.archives-ouvertes.fr/hal-03610330 ; https://aclanthology.org/2021.americasnlp-1.pdf (2021)
|
|
Abstract:
The Quechua linguistic family has a limited number of NLP resources, most of them dedicated to Southern Quechua, whereas the varieties of Central Quechua have, to the best of our knowledge, no specific resources (software, lexicon or corpus). Our work addresses this issue by producing two resources for Ancash Quechua: a full digital version of a dictionary, and an OCR model adapted to this variety. In this paper, we describe the steps towards this goal: we first measure the performance of existing models on the task of digitising a Quechua dictionary, then adapt a model for the Ancash variety, and finally create a reliable resource for NLP in XML-TEI format. We hope that this work will be a basis for initiating NLP projects for Central Quechua, and that it will encourage digitisation initiatives for under-resourced languages.
|
|
Keywords:
Computer Science / Computation and Language [cs.CL]; Computer Science / Document and Text Processing; Humanities and Social Sciences / Linguistics
|
|
URL: https://hal.archives-ouvertes.fr/hal-03610330/file/Towards%20creation%20of%20Ancash%20Quechua%20lexical%20resources%20from%20OCR.pdf ; https://hal.archives-ouvertes.fr/hal-03610330 ; https://hal.archives-ouvertes.fr/hal-03610330/document
|
|
3 |
Adapting a Pre-Neural Named Entity Recognizer and Linker to Historical Data
|
|
|
|
In: CEUR Workshop Proceedings, Conference and Labs of the Evaluation Forum, Sep 2020, Thessaloniki, Greece ; https://hal-inalco.archives-ouvertes.fr/hal-03613392 (2020)
|
|
4 |
NLU-Co at SemEval-2020 Task 5: NLU/SVM based model apply to characterise and extract counterfactual items on raw data
|
|
|
|
In: COLING ; SemEval-2020 (International Workshop on Semantic Evaluation 2020), Dec 2020, Barcelona, Spain. pp.670-676 ; https://hal.archives-ouvertes.fr/hal-03119450 ; https://www.aclweb.org/anthology/2020.semeval-1.87 (2020)
|
|
7 |
Vers une ontologie de la nomination et de la référence dédiée à l'annotation des textes
|
|
|
|
In: 13th Terminology & Ontology: Theories and applications (TOTh) International Conference, Jun 2019, Chambéry, France ; https://hal.archives-ouvertes.fr/hal-02269154 (2019)
|
|
8 |
Analysis and Automatic Processing of Discourse
|
|
|
|
In: Corpus Linguistics (CL2019), Jul 2019, Cardiff, United Kingdom ; https://hal.archives-ouvertes.fr/hal-02377077 (2019)
|
|
11 |
A Bambara Tonalization System for Word Sense Disambiguation Using Differential Coding, Segmentation and Edit Operation Filtering
|
|
|
|
In: Proceedings of the 8th International Joint Conference on Natural Language Processing (IJCNLP 2017), Nov 2017, Taipei, Taiwan. pp.694-703 ; https://hal.archives-ouvertes.fr/hal-01685393 (2017)
|
|
12 |
Une approche linguistique pour la détection des dialectes arabes
|
|
|
|
In: Actes de TALN 2017, 2017-06-26, Orléans, France ; https://hal.archives-ouvertes.fr/hal-02012244 (2017)
|
|
13 |
Detecting Influencial Users in Social Networks: Analysing Graph-Based and Linguistic Perspectives
|
|
|
|
In: IFIP Advances in Information and Communication Technology ; 5th IFIP International Workshop on Artificial Intelligence for Knowledge Management (AI4KM), Aug 2017, Melbourne, VIC, Australia. pp.113-131, ⟨10.1007/978-3-030-29904-0_9⟩ ; https://hal.inria.fr/hal-02517698 (2017)
|
|
15 |
Named Entity Resources - Overview and Outlook
|
|
|
|
In: Language Resources and Evaluation, 2016, Portorož, Slovenia ; https://hal-inalco.archives-ouvertes.fr/hal-01359441 (2016)
|
|
16 |
ReadME generation from an OWL ontology describing NLP tools
|
|
|
|
In: Proceedings of the 2nd International Workshop on Natural Language Generation and the Semantic Web (WebNLG), Sep 2016, Edinburgh, United Kingdom. pp.46-49, ⟨10.18653/v1/W16-3509⟩ ; https://hal-inalco.archives-ouvertes.fr/hal-01425724 (2016)
|
|
17 |
The MultiTal NLP tool infrastructure
|
|
|
|
In: Proceedings of the Workshop on Language Technology Resources and Tools for Digital Humanities (LT4DH), Dec 2016, Osaka, Japan. pp.156-163 ; https://hal-inalco.archives-ouvertes.fr/hal-01425728 (2016)
|
|
18 |
Named Entity Resources - Overview and Outlook
|
|
|
|
In: http://infoscience.epfl.ch/record/218493 (2016)
|
|