1 |
Morphologically Annotated Corpora and Morphological Analyzers for Moroccan and Sanaani Yemeni Arabic
In: 10th Language Resources and Evaluation Conference (LREC 2016) ; https://hal.archives-ouvertes.fr/hal-01349201 ; 10th Language Resources and Evaluation Conference (LREC 2016), May 2016, Portoroz, Slovenia (2016)
2 |
A Large Scale Corpus of Gulf Arabic
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-01349204 ; Language Resources and Evaluation Conference, 2016, Portoroz, Slovenia (2016)
3 |
Exploiting Arabic Diacritization for High Quality Automatic Annotation
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-01349206 ; Language Resources and Evaluation Conference, 2016, Portoroz, Slovenia (2016)
4 |
A Semi-automatic and Low Cost Approach to Build Scalable Lemma-based Lexical Resources for Arabic Verbs
In: International Journal of Information Technology and Computer Science (IJITCS) ; https://hal.archives-ouvertes.fr/hal-01270974 ; International Journal of Information Technology and Computer Science (IJITCS), IJIT, 2016, 8 (2), pp.1-13 ; http://www.mecs-press.org/ijitcs/ijitcs-v8-n2/IJITCS-V8-N2-1.pdf (2016)
Abstract:
This work presents a method that enables the Arabic NLP community to build scalable lexical resources. The proposed method is low-cost, time-efficient, scalable, and extensible: it is incremental both in the resources it processes and in the lexicons it generates. The method starts from a corpus. First, tokens are drawn from the corpus and lemmatized. Second, finite state transducers (FSTs) are generated semi-automatically. Finally, the FSTs are used to produce all possible inflected verb forms with their full morphological features. One strength of the algorithm is its ability to generate transducers with 184 transitions, which would be very cumbersome to design manually. A second strength is a new inflection scheme for Arabic verbs, which increases the efficiency of the FST generation algorithm. The experiments use a representative corpus of Modern Standard Arabic. The number of semi-automatically generated transducers is 171. The resulting open lexical resources have high coverage: they cover more than 70% of Arabic verbs. The built resources contain 16,855 verb lemmas and 11,080,355 fully vocalized, partially vocalized, and unvocalized inflected verb forms. All these resources are being made public and are currently used as an open package in the Unitex framework, available under the LGPL license.
Keyword:
[INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; Arabic linguistic resources; Arabic NLP; Arabic verbs; Finite state transducers; Unitex
URL: https://hal.archives-ouvertes.fr/hal-01270974
5 |
Adaptation of a Term Extractor to Arabic Specialised Texts: First Experiments and Limits
In: International Conference on Intelligent Text Processing and Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-01771875 ; International Conference on Intelligent Text Processing and Computational Linguistics, Springer, Jan 2016, Konya, Turkey (2016)
6 |
Automatic processing of Tunisian dialect: construction of linguistic resources ; TRAITEMENT AUTOMATIQUE DU DIALECTE TUNISIEN : CONSTRUCTION DE RESSOURCES LINGUISTIQUES
In: https://hal.archives-ouvertes.fr/tel-02869866 ; Computation and Language [cs.CL]. Université de Sfax (Tunisia), 2016. In French (2016)
7 |
DALILA: The Dialectal Arabic Linguistic Learning Assistant
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-01349203 ; Language Resources and Evaluation Conference, 2016, Portoroz, Slovenia (2016)
8 |
An Algerian dialect: Study and Resources
In: ISSN: 2158-107X ; EISSN: 2156-5570 ; International journal of advanced computer science and applications (IJACSA) ; https://hal.archives-ouvertes.fr/hal-01297415 ; International journal of advanced computer science and applications (IJACSA), The Science and Information Organization, 2016, 7 (3), pp.384-396. ⟨10.14569/IJACSA.2016.070353⟩ (2016)
9 |
ArabTAG: from a Handcrafted to a Semi-automatically Generated TAG
In: TAG+12: 12th International Workshop on Tree-Adjoining Grammars and Related Formalisms ; https://hal.archives-ouvertes.fr/hal-01320995 ; TAG+12: 12th International Workshop on Tree-Adjoining Grammars and Related Formalisms, Jun 2016, Düsseldorf, Germany (2016)
10 |
Recognition and TEI annotation of Arabic Events Using Transducers
In: 17th International Conference on Intelligent Text Processing and Computational Linguistics ; https://hal.archives-ouvertes.fr/hal-01291336 ; 17th International Conference on Intelligent Text Processing and Computational Linguistics, Apr 2016, Konya, Turkey (2016)