5 |
Morphologically Annotated Corpora and Morphological Analyzers for Moroccan and Sanaani Yemeni Arabic
In: 10th Language Resources and Evaluation Conference (LREC 2016) ; https://hal.archives-ouvertes.fr/hal-01349201 ; 10th Language Resources and Evaluation Conference (LREC 2016), May 2016, Portoroz, Slovenia (2016)
6 |
A Large Scale Corpus of Gulf Arabic
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-01349204 ; Language Resources and Evaluation Conference, 2016, Portoroz, Slovenia (2016)
7 |
Exploiting Arabic Diacritization for High Quality Automatic Annotation
In: Language Resources and Evaluation Conference ; https://hal.archives-ouvertes.fr/hal-01349206 ; Language Resources and Evaluation Conference, 2016, Portoroz, Slovenia (2016)
9 |
Linguistic Landscape in the School Setting: the Case of the Druze in Israel
In: Faculty Contributions to Books (2016)
10 |
Attending to face in faceless computer-mediated communication : (im)politeness in online disagreements among Arabic speakers
11 |
Invitation in Saudi Arabic : a socio-pragmatic analysis ; Title on signature form: Invitation in Saudi culture : socio-pragmatic analysis.
12 |
Romanized Arabic and Berber detection using prediction by partial matching and dictionary methods
In: 2016 IEEE/ACS 13th International Conference of Computer Systems and Applications (AICCSA) ; https://hal-cea.archives-ouvertes.fr/cea-01841162 ; 2016 IEEE/ACS 13th International Conference of Computer Systems and Applications (AICCSA), Nov 2016, Agadir, Morocco. ⟨10.1109/AICCSA.2016.7945668⟩ (2016)
13 |
Development of an Automatic Speech Recognition System for the Egyptian Dialect of Arabic in a Telephone Channel
Romanenko A.N. - : Federal State Autonomous Educational Institution of Higher Education "Saint Petersburg National Research University of Information Technologies, Mechanics and Optics", 2016
14 |
General issues of lexical-semantic transformations in verbal phraseological units of the Arabic language
15 |
Some considerations in regard to the evolution of meanings in reference to the word «Palita/fa Tila»
16 |
On the Functioning of Modern Arabic Terminology
El Sabruti Rashida Rakhimovna. - : Federal State Budgetary Educational Institution of Higher Professional Education "National Research Tomsk State University", 2016
18 |
Difference and Dissidence: French, Arabic and Cultural Conflict in Lebanon, 1943-1975
19 |
Pivot-based Statistical Machine Translation for Morphologically Rich Languages
Abstract:
This thesis describes research on pivot-based statistical machine translation (SMT) for morphologically rich languages (MRL). We provide a framework for translating to and from morphologically rich languages, especially when little or no parallel data exists between the source and target languages. We address three main challenges: the sparsity of data that results from morphological richness, maximizing the precision and recall of the pivoting process itself, and making use of any parallel data between the source and target languages.

To address data sparsity, we explored a space of tokenization schemes and normalization options. We also examined a set of six detokenization techniques to evaluate detokenized and orthographically corrected (enriched) output. We provide a recipe of the best settings for translating into one of the most challenging languages, namely Arabic. Our best model improves translation quality over the baseline by 1.3 BLEU points. We also investigated separating translation from morphology generation, comparing three methods of modeling morphological features: the features can be modeled as part of the core translation, generated using target monolingual context, or predicted using both source and target information. In our experiments, we outperform the vanilla factored translation model. Deciding which features to translate, generate, or predict requires a detailed error analysis of the system output. To that end, we present AMEANA, an open-source tool for error analysis of natural language processing tasks, targeting morphologically rich languages.

The second challenge is the pivoting process itself. We discuss several techniques to improve the precision and recall of pivot matching. One technique to improve recall works at the level of word alignment, as an optimization process for pivoting driven by generating phrase pairs between the source and target languages. Although improving the recall of pivot matching improves overall translation quality, we also need to increase its precision. To achieve this, we introduce quality constraint scores to determine the quality of the pivot phrase pairs between the source and target languages. We show positive results for different language pairs, which demonstrates the consistency of our approaches. In one of our best models we reach an improvement of 1.2 BLEU points.

The third challenge is how to make use of any parallel data between the source and target languages. We build on the approach of improving the precision of the pivoting process and on methods for combining the pivot system with the direct system built from the parallel data. In one approach, we introduce morphology constraint scores, which are added to the log-linear space of features to determine the quality of the pivot phrase pairs. We compare two methods of generating the morphology constraints. One is based on hand-crafted rules relying on our knowledge of the source and target languages; in the other, the morphology constraints are induced from available parallel data between the source and target languages, which we also use to build a direct translation model. We then combine the pivot and direct models to achieve better coverage and overall translation quality. Using induced morphology constraints outperformed the hand-crafted rules and improved over our best model from all previous approaches by 0.6 BLEU points (7.2 and 6.7 BLEU points over the direct and pivot baselines, respectively). Finally, we introduce smart techniques for combining the pivot and direct models. We show that smart selective combination can lead to a large reduction of the pivot model without affecting performance, and in some cases improves it.
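As a rough illustration of what pivoting through a third language involves, the sketch below uses phrase-table triangulation, a commonly used pivoting scheme. The abstract does not state which formulation the thesis uses, so this is an assumption, and the function name `triangulate`, the phrase labels, and all probabilities are hypothetical toy values.

```python
from collections import defaultdict


def triangulate(src_to_pivot, pivot_to_tgt):
    """Build source-to-target phrase scores by marginalizing over pivot phrases.

    Assumed formulation (not taken from the thesis):
        p(t | s) = sum over pivot phrases p of p(p | s) * p(t | p)
    """
    src_to_tgt = defaultdict(lambda: defaultdict(float))
    for src, pivots in src_to_pivot.items():
        for pivot, p_pivot_given_src in pivots.items():
            # Only pivot phrases present in both tables contribute.
            for tgt, p_tgt_given_pivot in pivot_to_tgt.get(pivot, {}).items():
                src_to_tgt[src][tgt] += p_pivot_given_src * p_tgt_given_pivot
    return {src: dict(tgts) for src, tgts in src_to_tgt.items()}


# Toy phrase tables with invented entries (hypothetical data).
src_to_pivot = {"src_a": {"house": 0.7, "home": 0.3}}
pivot_to_tgt = {
    "house": {"tgt_x": 0.8, "tgt_y": 0.2},
    "home": {"tgt_x": 0.5, "tgt_z": 0.5},
}

table = triangulate(src_to_pivot, pivot_to_tgt)
for tgt, score in sorted(table["src_a"].items()):
    print(tgt, round(score, 2))
```

In this toy setup, a source phrase is linked to a target phrase only when both share a pivot phrase, which is why recall and precision of pivot matching matter in such a scheme.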
Keyword:
Arabic language; Comparative and general--Morphology; Computer science; Grammar; Machine translating; Natural language processing (Computer science)
URL: https://doi.org/10.7916/D84J0DZ9
20 |
Teaching non-verbal communication by the authentic video in class of FFL in Libya ; Enseignement de la communication non verbale par la vidéo authentique en classe de FLE en Libye
In: https://hal.univ-lorraine.fr/tel-02019379 ; Linguistique. Université de Lorraine, 2016. Français. ⟨NNT : 2016LORR0249⟩ (2016)