Search in the Catalogues and Directories

Hits 1 – 11 of 11

1
Borderlands of text mapping: Experiments on Fontane's Brandenburg
In: Workshop INF-DH-2018 (Informatik und die Digital Humanities) ; https://hal.archives-ouvertes.fr/hal-01951880 ; Workshop INF-DH-2018 (Informatik und die Digital Humanities), Sep 2018, Berlin, Germany. ⟨10.18420/infdh2018-05⟩ (2018)
BASE
2
Data-Driven Identification of German Phrasal Compounds
In: Text, Speech, and Dialogue ; https://hal.archives-ouvertes.fr/hal-01575651 ; Kamil Ekštein; Václav Matoušek. Text, Speech, and Dialogue, 10415, Springer International Publishing, pp.192-200, 2017, Lecture Notes in Computer Science, 978-3-319-64205-5. ⟨10.1007/978-3-319-64206-2_22⟩ ; https://link.springer.com/bookseries/558 (2017)
Abstract: Proceedings of the 20th International Conference, TSD 2017, Prague, Czech Republic, August 27-31, 2017 ; International audience ; We present a method to identify and document a phenomenon on which there is very little empirical data: German phrasal compounds occurring as a single token (without punctuation between their components). Relying on linguistic criteria, our approach requires an operational notion of compounds that can be systematically applied, as well as (web) corpora that are large and diverse enough to contain rarely seen phenomena. The method is based on word segmentation and morphological analysis, and it takes advantage of a data-driven learning process. Our results show that coarse-grained identification of phrasal compounds is best performed with empirical data, whereas fine-grained detection could be improved with a combination of rule-based and frequency-based word lists. Along with the characteristics of web texts, the orthographic realizations seem to be linked to the degree of expressivity.
Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; ACM: H.: Information Systems/H.3: INFORMATION STORAGE AND RETRIEVAL/H.3.1: Content Analysis and Indexing/H.3.1.1: Dictionaries; ACM: H.: Information Systems/H.3: INFORMATION STORAGE AND RETRIEVAL/H.3.1: Content Analysis and Indexing/H.3.1.3: Linguistic processing; ACM: H.: Information Systems/H.3: INFORMATION STORAGE AND RETRIEVAL/H.3.7: Digital Libraries/H.3.7.0: Collection; corpus linguistics; morphological analysis; web corpora; word segmentation
URL: https://hal.archives-ouvertes.fr/hal-01575651
https://hal.archives-ouvertes.fr/hal-01575651/document
https://hal.archives-ouvertes.fr/hal-01575651/file/Barbaresi%26Hein_2017_Data-driven-Identification-of-German-Phrase-Compounds.pdf
https://doi.org/10.1007/978-3-319-64206-2_22
BASE
3
Discriminating between Similar Languages using Weighted Subword Features
In: Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2017) ; https://hal.archives-ouvertes.fr/hal-01575656 ; Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2017), Association for Computational Linguistics (ACL), Apr 2017, Valencia, Spain. pp.184-189, ⟨10.18653/v1/W17-1223⟩ ; http://ttg.uni-saarland.de/vardial2017/ (2017)
BASE
4
Bootstrapped OCR error detection for a less-resourced language variant
In: Proceedings of the 13th Conference on Natural Language Processing (KONVENS 2016) ; 13th Conference on Natural Language Processing (KONVENS 2016) ; https://hal.archives-ouvertes.fr/hal-01371689 ; 13th Conference on Natural Language Processing (KONVENS 2016), Sep 2016, Bochum, Germany. pp.21-26 ; https://www.linguistics.ruhr-uni-bochum.de/konvens16/ (2016)
BASE
5
An Unsupervised Morphological Criterion for Discriminating Similar Languages
In: 3rd Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2016) ; https://hal.archives-ouvertes.fr/hal-01575653 ; 3rd Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2016), Dec 2016, Osaka, Japan. pp.212-220 ; http://ttg.uni-saarland.de/vardial2016/ (2016)
BASE
6
Visualisierung von Ortsnamen im Deutschen Textarchiv
In: DHd 2016 ; https://halshs.archives-ouvertes.fr/halshs-01287931 ; DHd 2016, Mar 2016, Leipzig, Germany. pp.264-267 ; http://dhd2016.de/ (2016)
BASE
7
APIs in Digital Humanities: The Infrastructural Turn
In: Digital Humanities 2016 ; https://hal.archives-ouvertes.fr/hal-01348706 ; Digital Humanities 2016, Jul 2016, Kraków, Poland. pp.93-96 ; http://dh2016.adho.org/ (2016)
BASE
8
Collection and Indexing of Tweets with a Geographical Focus
In: Tenth International Conference on Language Resources and Evaluation (LREC 2016) ; https://hal.archives-ouvertes.fr/hal-01323274 ; Tenth International Conference on Language Resources and Evaluation (LREC 2016), May 2016, Portorož, Slovenia. pp.24-27 (2016)
BASE
9
Extraction and Visualization of Toponyms in Diachronic Text Corpora
In: Digital Humanities 2016 ; https://hal.archives-ouvertes.fr/hal-01348696 ; Digital Humanities 2016, Jul 2016, Kraków, Poland. pp.732-734 ; http://dh2016.adho.org/ (2016)
BASE
10
Efficient construction of metadata-enhanced web corpora
In: Proceedings of the 10th Web as Corpus Workshop ; 10th Web as Corpus Workshop ; https://hal.archives-ouvertes.fr/hal-01371704 ; 10th Web as Corpus Workshop, Association for Computational Linguistics (ACL SIGWAC), Aug 2016, Berlin, Germany. pp.7-16, ⟨10.18653/v1/W16-2602⟩ (2016)
BASE
11
Collection, Description, and Visualization of the German Reddit Corpus
In: 2nd Workshop on Natural Language Processing for Computer-Mediated Communication ; https://hal.archives-ouvertes.fr/hal-01207311 ; 2nd Workshop on Natural Language Processing for Computer-Mediated Communication, Sep 2015, Essen, Germany. pp.7-11 ; https://sites.google.com/site/nlp4cmc2015/program (2015)
BASE
