1 |
MasakhaNER: Named entity recognition for African languages
|
|
Adelani, David; Abbott, Jade; Neubig, Graham; D'Souza, Daniel; Kreutzer, Julia; Lignos, Constantine; Palen-Michel, Chester; Buzaaba, Happy; Rijhwani, Shruti; Ruder, Sebastian; Mayhew, Stephen; Abebe Azime, Israel; Muhammad, Shamsuddeen; Chinenye Emezue, Chris; Nakatumba-Nabende, Joyce; Ogayo, Perez; Aremu, Anuoluwapo; Gitau, Catherine; Mbaye, Derguene; Alabi, Jesujoba; Yimam, Seid; Rabiu Gwadabe, Tajuddeen; Ezeani, Ignatius; Niyongabo, Rubungo; Mukiibi, Jonathan; Otiende, Verrah; Orife, Iroro; David, Davis; Ngom, Samba; Adewumi, Tosin; Rayson, Paul; Adeyemi, Mofetoluwa; Muriuki, Gerald; Anebi, Emmanuel; Chukwuneke, Chiamaka; Odu, Nkiruka; Wairagala, Eric; Oyerinde, Samuel; Siro, Clemencia; Saul Bateesa, Tobius; Oloyede, Temilola; Wambui, Yvonne; Akinode, Victor; Nabagereka, Deborah; Katusiime, Maurice; Awokoya, Ayodele; Mboup, Mouhamadane; Gebreyohannes, Dibora; Tilaye, Henok; Nwaike, Kelechi; Wolde, Degaga; Faye, Abdoulaye; Sibanda, Blessing; Ahia, Orevaoghene; Dossou, Bonaventure; Ogueji, Kelechi; Thierno, Ibrahima; DIALLO, Abdoulaye; Akinfaderin, Adewale; Marengereke, Tendai; Osei, Salomey
|
|
In: Transactions of the Association for Computational Linguistics, The MIT Press, 2021, ⟨10.1162/tacl⟩ ; EISSN: 2307-387X ; https://hal.inria.fr/hal-03350962 (2021)
|
|
Abstract:
We take a step towards addressing the underrepresentation of the African continent in NLP research by bringing together different stakeholders to create the first large, publicly available, high-quality dataset for named entity recognition (NER) in ten African languages. We detail the characteristics of these languages to help researchers and practitioners better understand the challenges they pose for NER tasks. We analyze our datasets and conduct an extensive empirical evaluation of state-of-the-art methods across both supervised and transfer learning settings. Finally, we release the data, code, and models to inspire future research on African NLP.
|
|
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]
|
|
URL: https://hal.inria.fr/hal-03350962 https://doi.org/10.1162/tacl https://hal.inria.fr/hal-03350962/document https://hal.inria.fr/hal-03350962/file/adelani_TACL2021.pdf
|
|
BASE
|
|
2 |
Explorations in Transfer Learning for OCR Post-Correction ...
|
|
|
|
BASE
|
|
3 |
Evaluating the Morphosyntactic Well-formedness of Generated Texts ...
|
|
|
|
BASE
|
|
4 |
Lexically-Aware Semi-Supervised Learning for OCR Post-Correction ...
|
|
|
|
BASE
|
|
5 |
Dependency Induction Through the Lens of Visual Perception ...
|
|
|
|
BASE
|
|
6 |
AlloVera: a multilingual allophone database
|
|
|
|
In: LREC 2020: 12th Language Resources and Evaluation Conference, European Language Resources Association, May 2020, Marseille, France ; https://halshs.archives-ouvertes.fr/halshs-02527046 ; https://lrec2020.lrec-conf.org/ (2020)
|
|
BASE
|
|
8 |
Improving Candidate Generation for Low-resource Cross-lingual Entity Linking
|
|
|
|
In: Transactions of the Association for Computational Linguistics, Vol 8, Pp 109-124 (2020)
|
|
BASE
|
|