41 |
Normalising orthographic and dialectal variants for the automatic processing of Swiss German
In: Proceedings of the 7th Language and Technology Conference (2015)
|
|
Abstract:
Swiss dialects of German are, unlike most dialects of well standardised languages, widely used in everyday communication. Despite this fact, they lack tools and resources for natural language processing. The main reason for this is the fact that the dialects are mostly spoken and that written resources are small and highly inconsistent. This paper addresses the great variability in writing that poses a problem for automatic processing. We propose an automatic approach to normalising the variants to a single representation intended for processing tools' internal use (not shown to human users). We manually create a sample of transcribed and normalised texts, which we use to train and test three methods based on machine translation: word-by-word mappings, character-based machine translation, and language modelling. We show that an optimal combination of the three approaches gives better results than any of them separately.
|
|
Keyword:
info:eu-repo/classification/ddc/410 (Dewey Decimal Classification 410: Linguistics)
|
|
URL: https://archive-ouverte.unige.ch/unige:82397
|
|
BASE
|
|
42 |
Crowdsourced mapping of pronunciation variants in European French
In: Proceedings of the 18th International Congress of Phonetic Sciences pp. 1-5 (2015)
|
|
BASE
|
|
43 |
Normalising orthographic and dialectal variants for the automatic processing of Swiss German
In: Samardžić, Tanja; Scherrer, Yves; Glaser, Elvira. Language and Technology Conference: Human Language Technologies as a Challenge for Computer Science and Linguistics, Poznan, Poland, 27-29 November 2015, pp. 294-298 (2015)
|
|
BASE
|
|
44 |
A language-independent and fully unsupervised approach to lexicon induction and part-of-speech tagging for closely related languages
In: Language Resources and Evaluation Conference, European Language Resources Association, May 2014, Reykjavik, Iceland (2014) ; https://hal.inria.fr/hal-01022298
|
|
BASE
|
|
45 |
Dialektometrische Analyse von schweizerdeutschen Dialektdaten
In: 18. Arbeitstagung zur alemannischen Dialektologie (2014)
|
|
BASE
|
|
46 |
Part-of-speech tagging for regional languages and dialects : A generic approach based on unsupervised learning
In: 8èmes Journées Suisses de la Linguistique (2014)
|
|
BASE
|
|
47 |
A language-independent and fully unsupervised approach to lexicon induction and part-of-speech tagging for closely related languages
In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014) (2014)
|
|
BASE
|
|
48 |
Unsupervised adaptation of supervised part-of-speech taggers for closely related languages
In: Proceedings of the First Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects (VarDial) pp. 30-38 (2014)
|
|
BASE
|
|
49 |
The distribution of aggregated syntactic construction types compared with other linguistic levels - A dialectometrical analysis of Swiss German dialects
In: Methods in Dialectology XV (2014)
|
|
BASE
|
|
50 |
Computerlinguistische Experimente für die schweizerdeutsche Dialektlandschaft: Maschinelle Übersetzung und Dialektometrie
In: ISBN: 978-3-515-10343-5 ; Alemannische Dialektologie: Dialekte im Kontakt (Beiträge zur 17. Arbeitstagung für alemannische Dialektologie in Strassburg) pp. 261-278 (2014)
|
|
BASE
|
|
51 |
SwissAdmin: a multilingual tagged parallel corpus of press releases
In: Proceedings of the Ninth International Conference on Language Resources and Evaluation (LREC 2014) (2014)
|
|
BASE
|
|
52 |
Digitizing the linguistic atlas of German-speaking Switzerland
In: Methods in Dialectology XV (2014)
|
|
BASE
|
|
54 |
Kurzbericht über die Dialektometrisierung des Gesamtnetzes des „Sprachatlasses der deutschen Schweiz“ (SDS)
In: ISBN: 978-3-11-030930-0 ; Vielfalt, Variation und Stellung der deutschen Sprache pp. 153-176 (2013)
|
|
BASE
|
|
55 |
Generating Swiss German sentences from Standard German: a multi-dialectal approach ...
BASE
|
|
56 |
The Trilingual ALLEGRA Corpus: Presentation and Possible Use for Lexicon Induction
In: ISBN: 978-2-9517408-7-7 ; Proceedings of the Eight International Conference on Language Resources and Evaluation (LREC'12) pp. 2890-2896 (2012)
|
|
BASE
|
|
57 |
Création automatique de dictionnaires bilingues d'entités nommées grâce à Wikipédia
In: ISSN: 1661-8246 ; Cahiers de linguistique française, Vol. 30, No 11, pp. 213-227 (2012)
|
|
BASE
|
|
58 |
Machine translation into multiple dialects: The example of Swiss German
In: 7th SIDG Congress - Dialect 2.0 (2012)
|
|
BASE
|
|
59 |
Dialäkt Äpp - A smartphone application for Swiss German dialects with great scientific potential
In: 7th SIDG Congress - Dialect 2.0 (2012)
|
|
BASE
|
|
60 |
Dialektometrische Experimente mit schweizerdeutschem Dialektmaterial
In: Graduiertenkolloquium Linguistik (2012)
|
|
BASE
|
|