Machine Translation of Arabic Dialects ...
Abstract:
This thesis discusses different approaches to machine translation (MT) from Dialectal Arabic (DA) to English. These approaches address Arabic dialects at varying stages of resource availability, in terms of both the types of resources and the amounts of training data. The overall theme of this work is building dialectal resources and MT systems, or enriching existing ones, using the currently available resources (dialectal or standard) in order to scale quickly and cheaply to more dialects without spending years and millions of dollars creating such resources for every dialect. Unlike parallel corpora for Modern Standard Arabic (MSA), DA-English parallel corpora are scarce and available for only a few dialects. Dialects differ from each other and from MSA in orthography, morphology, phonology, and, to a lesser degree, syntax. This means that combining all available parallel data, from dialects and MSA, to train DA-to-English statistical machine translation (SMT) systems might not provide the desired results. Similarly, translating ...
Keywords:
Arabic language--Dialects; Arabic language--Machine translating; Artificial intelligence; Computer science
URL: https://dx.doi.org/10.7916/d8q25h44 https://academiccommons.columbia.edu/doi/10.7916/D8Q25H44
BASE