Search in the Catalogues and Directories

Hits 1 – 2 of 2

1
A Semi-automatic and Low Cost Approach to Build Scalable Lemma-based Lexical Resources for Arabic Verbs
In: International Journal of Information Technology and Computer Science (IJITCS) ; https://hal.archives-ouvertes.fr/hal-01270974 ; International Journal of Information Technology and Computer Science (IJITCS), IJIT, 2016, 8 (2), pp.1-13 ; http://www.mecs-press.org/ijitcs/ijitcs-v8-n2/IJITCS-V8-N2-1.pdf (2016)
Abstract: This work presents a method that enables the Arabic NLP community to build scalable lexical resources. The proposed method is low-cost and time-efficient, as well as scalable and extensible: it is incremental both in processing resources and in generating lexicons. Starting from a corpus, tokens are first drawn and lemmatized. Second, finite state transducers (FSTs) are generated semi-automatically. Finally, the FSTs are used to produce all possible inflected verb forms with their full morphological features. One strength of the algorithm is its ability to generate transducers with 184 transitions, which would be very cumbersome to design manually. A second strength is a new inflection scheme for Arabic verbs, which increases the efficiency of the FST generation algorithm. The experiments use a representative corpus of Modern Standard Arabic. The number of semi-automatically generated transducers is 171. The resulting open lexical resources have high coverage: they cover more than 70% of Arabic verbs. The built resources contain 16,855 verb lemmas and 11,080,355 fully, partially, and non-vocalized inflected verb forms. All these resources are being made public and are currently used as an open package in the Unitex framework, available under the LGPL license.
Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; Arabic linguistic resources; Arabic NLP; Arabic verbs; Finite state transducers; Unitex
URL: https://hal.archives-ouvertes.fr/hal-01270974
BASE
2
Recognition and TEI annotation of Arabic Events Using Transducers
In: 17th International Conference on Intelligent Text Processing and Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-01291336 ; 17th International Conference on Intelligent Text Processing and Computational Linguistics, Apr 2016, Konya, Turkey (2016)
BASE