21 | Multi-source morphosyntactic tagging for Spoken Rusyn
In: Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (2017)
BASE

22 | A Quantitative Approach to Swiss German Dialect Syntax
In: International Conference on Language Variation in Europe (ICLAVE 9) (2017)
BASE

23 | Lexicon Induction for Spoken Rusyn – Challenges and Results
In: Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing (2017)
BASE

24 | Findings of the VarDial Evaluation Campaign 2017
In: Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (2017)
BASE

25 | Towards automatic geolocalisation of speakers of European French
In: International Conference on Language Variation in Europe (ICLAVE 9) (2017)
BASE

26 | Combien d'accents en français? Focus sur la France, la Belgique et la Suisse
In: Processus de différenciation: des pratiques langagières à leur interprétation sociale - Actes du colloque VALS-ASLA 2016, Vol. 1 (2017)
BASE

27 | Multi-source morphosyntactic tagging for spoken Rusyn
In: Proceedings of the Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (2017), pp. 84-92
BASE

28 | Lexicon Induction for Spoken Rusyn – Challenges and Results
In: Proceedings of the 6th Workshop on Balto-Slavic Natural Language Processing (2017), pp. 27-32
BASE

29 | Cartopho : un site web de cartographie de variantes de prononciation en français
In: Journées d'Études sur la Parole (JEP 2016), Paris, pp. 119-127 (2016) ; https://hal.archives-ouvertes.fr/hal-01621840
BASE

32 | A quantitative approach to Swiss German - Dialectometric analyses and comparisons of linguistic levels ...
BASE

33 | Schweizerdeutsche Dialekte quantitativ – Dialektometrische Analysen und Vergleich linguistischer Ebenen
In: 13. Bayerisch-Österreichische Dialektologentagung (BÖDT) (2016)
BASE

34 | Modernising historical Slovene words
In: Natural Language Engineering, Vol. 22, No. 6 (2016), pp. 881-905 ; ISSN: 1351-3249
Abstract:
We propose a language-independent word normalisation method and exemplify it on modernising historical Slovene words. Our method relies on character-level statistical machine translation (CSMT) and uses only shallow knowledge. We present relevant data on historical Slovene, consisting of two (partially) manually annotated corpora and the lexicons derived from these corpora, containing historical word–modern word pairs. The two lexicons are disjoint, with one serving as the training set containing 40,000 entries, and the other as a test set with 20,000 entries. The data spans the years 1750–1900, and the lexicons are split into fifty-year slices, with all the experiments carried out separately on the three time periods. We perform two sets of experiments. In the first one – a supervised setting – we build a CSMT system using the lexicon of word pairs as training data. In the second one – an unsupervised setting – we simulate a scenario in which word pairs are not available. We propose a two-step method where we first extract a noisy list of word pairs by matching historical words with cognate modern words, and then train a CSMT system on these pairs. In both sets of experiments, we also optionally make use of a lexicon of modern words to filter the modernisation hypotheses. While we show that both methods produce significantly better results than the baselines, their accuracy and which method works best strongly correlates with the age of the texts, meaning that the choice of the best method will depend on the properties of the historical language which is to be modernised. As an extrinsic evaluation, we also compare the quality of part-of-speech tagging and lemmatisation directly on historical text and on its modernised words. We show that, depending on the age of the text, annotation on modernised words also produces significantly better results than annotation on the original text.
Keyword: info:eu-repo/classification/ddc/410
URL: https://archive-ouverte.unige.ch/unige:82305
BASE

35 | Normalizing orthographic and dialectal variants in the ArchiMob corpus of spoken Swiss German
In: 6th Days of Swiss Linguistics (2016)
BASE

36 | Automatic normalisation of the Swiss German ArchiMob corpus using character-level machine translation
In: Proceedings of the 13th Conference on Natural Language Processing (KONVENS) (2016)
BASE

37 | On-line Multilingual Linguistic Services
In: Proceedings of COLING 2016 System Demonstrations (2016) ; ISBN: 978-4-87974-703-7
BASE

38 | ArchiMob - A Corpus of Spoken Swiss German
In: Proceedings of the Tenth International Conference on Language Resources and Evaluation (LREC 2016) (2016)
BASE

39 | A quantitative approach to Swiss German - Dialectometric analyses and comparisons of linguistic levels
In: Scherrer, Yves; Stoeckle, Philipp (2016). A quantitative approach to Swiss German - Dialectometric analyses and comparisons of linguistic levels. Dialectologia et Geolinguistica, 24(1):92-125.
BASE

40 | Normalising orthographic and dialectal variants for the automatic processing of Swiss German ...
BASE