Hits 1 – 11 of 11

1	Borderlands of text mapping: Experiments on Fontane's Brandenburg
	Barbaresi, Adrien
	In: Workshop INF-DH-2018 (Informatik und die Digital Humanities) ; https://hal.archives-ouvertes.fr/hal-01951880 ; Workshop INF-DH-2018 (Informatik und die Digital Humanities), Sep 2018, Berlin, Germany. ⟨10.18420/infdh2018-05⟩ (2018)
	BASE

2	Data-Driven Identification of German Phrasal Compounds
	Barbaresi, Adrien; Hein, Katrin
	In: Text, Speech, and Dialogue ; https://hal.archives-ouvertes.fr/hal-01575651 ; Kamil Ekštein; Václav Matoušek. Text, Speech, and Dialogue, 10415, Springer International Publishing, pp.192-200, 2017, Lecture Notes in Computer Science, 978-3-319-64205-5. ⟨10.1007/978-3-319-64206-2_22⟩ ; https://link.springer.com/bookseries/558 (2017)
	BASE

3	Discriminating between Similar Languages using Weighted Subword Features
	Barbaresi, Adrien
	In: Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2017) ; https://hal.archives-ouvertes.fr/hal-01575656 ; Fourth Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2017), Association for Computational Linguistics (ACL), Apr 2017, Valence, Spain. pp.184-189, ⟨10.18653/v1/W17-1223⟩ ; http://ttg.uni-saarland.de/vardial2017/ (2017)
	BASE

4	Bootstrapped OCR error detection for a less-resourced language variant
	Barbaresi, Adrien
	In: Proceedings of the 13th Conference on Natural Language Processing (KONVENS 2016) ; 13th Conference on Natural Language Processing (KONVENS 2016) ; https://hal.archives-ouvertes.fr/hal-01371689 ; 13th Conference on Natural Language Processing (KONVENS 2016), Sep 2016, Bochum, Germany. pp.21-26 ; https://www.linguistics.ruhr-uni-bochum.de/konvens16/ (2016)
	BASE

5	An Unsupervised Morphological Criterion for Discriminating Similar Languages
	Barbaresi, Adrien
	In: 3rd Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2016) ; https://hal.archives-ouvertes.fr/hal-01575653 ; 3rd Workshop on NLP for Similar Languages, Varieties and Dialects (VarDial 2016), Dec 2016, Osaka, Japan. pp.212-220 ; http://ttg.uni-saarland.de/vardial2016/ (2016)
	BASE

6	Visualisierung von Ortsnamen im Deutschen Textarchiv
	Barbaresi, Adrien
	In: DHd 2016 ; https://halshs.archives-ouvertes.fr/halshs-01287931 ; DHd 2016, Mar 2016, Leipzig, Germany. pp.264-267 ; http://dhd2016.de/ (2016)
	BASE

7	APIs in Digital Humanities: The Infrastructural Turn
	Tasovac, Toma; Barbaresi, Adrien; Clérice, Thibault...
	In: Digital Humanities 2016 ; https://hal.archives-ouvertes.fr/hal-01348706 ; Digital Humanities 2016, Jul 2016, Kraków, Poland. pp.93-96 ; http://dh2016.adho.org/ (2016)
	BASE

8	Collection and Indexing of Tweets with a Geographical Focus
	Barbaresi, Adrien
	In: Tenth International Conference on Language Resources and Evaluation (LREC 2016) ; https://hal.archives-ouvertes.fr/hal-01323274 ; Tenth International Conference on Language Resources and Evaluation (LREC 2016), May 2016, Portorož, Slovenia. pp.24-27 (2016)
	Abstract: This paper introduces a Twitter corpus currently focused geographically in order to (1) test selection and collection processes for a given region and (2) find a suitable database to query, filter, and visualize the tweets. Due to access restrictions, it is not possible to retrieve all available tweets, which is why corpus construction implies a series of decisions described below. The corpus focuses on Austrian users, as data collection grounds on a two-tier detection process addressing corpus construction and user location issues. The emphasis lies on short messages whose sender mentions a place in Austria as his/her hometown or tweets from places located in Austria. The resulting user base is then queried and enlarged using focused crawling and random sampling, so that the corpus is refined and completed in the way of a monitor corpus. Its current volume is 21.7 million tweets from approximately 125,000 users. The tweets are indexed using Elasticsearch and queried via the Kibana frontend, which allows for queries on metadata as well as for the visualization of geolocalized tweets (currently about 3.3% of the collection).
	Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-WB]Computer Science [cs]/Web; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; Computer-Mediated Communication; Database Solutions; Visualization; Web Corpus Construction
	URL: https://hal.archives-ouvertes.fr/hal-01323274v3/document https://hal.archives-ouvertes.fr/hal-01323274 https://hal.archives-ouvertes.fr/hal-01323274v3/file/Barbaresi_CMLC2016_Twitter_archive.pdf
	BASE

9	Extraction and Visualization of Toponyms in Diachronic Text Corpora
	Barbaresi, Adrien; Biber, Hanno
	In: Digital Humanities 2016 ; https://hal.archives-ouvertes.fr/hal-01348696 ; Digital Humanities 2016, Jul 2016, Kraków, Poland. pp.732-734 ; http://dh2016.adho.org/ (2016)
	BASE

10	Efficient construction of metadata-enhanced web corpora
	Barbaresi, Adrien
	In: Proceedings of the 10th Web as Corpus Workshop ; 10th Web as Corpus Workshop ; https://hal.archives-ouvertes.fr/hal-01371704 ; 10th Web as Corpus Workshop, Association for Computational Linguistics (ACL SIGWAC), Aug 2016, Berlin, Germany. pp.7-16, ⟨10.18653/v1/W16-2602⟩ (2016)
	BASE

11	Collection, Description, and Visualization of the German Reddit Corpus
	Barbaresi, Adrien
	In: 2nd Workshop on Natural Language Processing for Computer-Mediated Communication ; https://hal.archives-ouvertes.fr/hal-01207311 ; 2nd Workshop on Natural Language Processing for Computer-Mediated Communication, Sep 2015, Essen, Germany. pp.7-11 ; https://sites.google.com/site/nlp4cmc2015/program (2015)
	BASE
