1 |
Proceedings of the Workshop on Challenges in the Management of Large Corpora (CMLC-9) 2021. Limerick, 12 July 2021 (Online-Event) ...
|
|
|
|
BASE
|
|
2 |
The FAIR Index of CMC Corpora
|
|
|
|
In: CMC Corpora through the prism of Digital Humanities (2020) ; https://hal.archives-ouvertes.fr/hal-03121698
|
|
BASE
|
|
3 |
Proceedings of the LREC 2020: 8th Workshop on Challenges in the Management of Large Corpora (CMLC-8)
|
|
In: Proceedings of the LREC 2020: 8th Workshop on Challenges in the Management of Large Corpora (CMLC-8). Edited by: Bański, Piotr; Barbaresi, Adrien; Clematide, Simon; Kupietz, Marc; Lüngen, Harald; Pisetta, Ines. Marseille, France: European Language Resources Association (2020)
|
|
BASE
|
|
4 |
Modelling Large Parallel Corpora: The Zurich Parallel Corpus Collection
|
|
|
|
In: Graën, Johannes; Kew, Tannon; Shaitarova, Anastassia; Volk, Martin (2019). Modelling Large Parallel Corpora: The Zurich Parallel Corpus Collection. In: Challenges in the Management of Large Corpora (CMLC-7), Cardiff, Wales, 22 July 2019.
|
|
BASE
|
|
5 |
Types and annotation of reply relations in computer-mediated communication
|
|
|
|
BASE
|
|
6 |
Connecting Resources: Which Issues Have to be Solved to Integrate CMC Corpora from Heterogeneous Sources and for Different Languages?
|
|
|
|
In: 5th Conference on CMC and Social Media Corpora for the Humanities (cmccorpora17), Oct 2017, Bolzano, Italy, pp. 52-55 ; https://hal.archives-ouvertes.fr/hal-01918880 ; https://doi.org/10.5281/zenodo.1040713 (2017)
|
|
BASE
|
|
7 |
Closing a gap in the language resources landscape: Groundwork and best practices from projects on computer-mediated communication in four European countries
|
|
|
|
In: CLARIN Annual Conference 2016, Oct 2016, Aix-en-Provence, France. Selected papers from the CLARIN Annual Conference 2016, Linköping Electronic Conference Proceedings 136, pp. 1-19, 2017, ISBN 978-91-7685-499-0 ; https://hal.archives-ouvertes.fr/hal-01379621 ; http://www.ep.liu.se/ecp/contents.asp?issue=136 (2017)
|
|
BASE
|
|
8 |
Connecting Resources: Which Issues Have to be Solved to Integrate CMC Corpora from Heterogeneous Sources and for Different Languages? ...
|
|
|
|
BASE
|
|
11 |
Efficient Exploration of Translation Variants in Large Multiparallel Corpora Using a Relational Database
|
|
|
|
In: Graën, Johannes; Clematide, Simon; Volk, Martin (2016). Efficient Exploration of Translation Variants in Large Multiparallel Corpora Using a Relational Database. In: 4th Workshop on the Challenges in the Management of Large Corpora, Portorož, 28 May 2016, pp. 20-23.
|
|
BASE
|
|
12 |
Integrating corpora of computer-mediated communication in CLARIN-D: Results from the curation project ChatCorpus2CLARIN
|
|
|
|
BASE
|
|
13 |
TEI across corpora, languages and genres: Towards a standard for the representation of social media and computer-mediated communication
|
|
|
|
In: Text Encoding Initiative: connect, animate, innovate. 2015 Annual Conference and Members’ Meeting of the TEI Consortium, Oct 2015, Lyon, France ; https://halshs.archives-ouvertes.fr/halshs-01222982 ; http://tei2015.huma-num.fr (2015)
|
|
BASE
|
|
14 |
Challenges in the alignment, management and exploitation of large and richly annotated multi-parallel corpora
|
|
|
|
In: Graën, Johannes; Clematide, Simon (2015). Challenges in the alignment, management and exploitation of large and richly annotated multi-parallel corpora. In: 3rd Workshop on the Challenges in the Management of Large Corpora, Lancaster, 20 July 2015, pp. 15-20.
|
|
BASE
|
|
15 |
Adding value to CMC corpora: CLARINification and part-of-speech annotation of the Dortmund Chat Corpus
|
|
|
|
BASE
|
|
16 |
Das Deutsche Referenzkorpus DEREKO im Jubiläumsjahr 2014
|
|
|
|
IDS Mannheim
|
|
17 |
Building Linguistic Corpora from Wikipedia Articles and Discussions
|
|
|
|
In: Journal for Language Technology and Computational Linguistics 29 (2014) 2, 59-82
|
|
IDS OBELEX meta
|
|
18 |
Mining corpora of computer-mediated communication: Analysis of linguistic features in Wikipedia talk pages using machine learning methods
|
|
|
|
Abstract:
Machine learning methods offer great potential for automatically investigating large amounts of data in the humanities. Our contribution to the workshop reports on ongoing work in the BMBF project KobRA (http://www.kobra.tu-dortmund.de), where we apply machine learning methods to the analysis of large corpora in language-focused research on computer-mediated communication (CMC). At the workshop, we will discuss first results from training a Support Vector Machine (SVM) to classify selected linguistic features in talk pages of the German Wikipedia corpus in DeReKo, provided by the IDS Mannheim. We will investigate different representations of the data in order to integrate complex syntactic and semantic information for the SVM. The results are intended to support both corpus-based research on CMC and the annotation of linguistic features in CMC corpora.
|
|
Keyword:
Computer-mediated communication; ddc:400; Corpus; Online community
|
|
URL: https://hildok.bsz-bw.de/files/276/01_06.pdf ; https://hildok.bsz-bw.de/frontdoor/index/index/docId/276 ; https://nbn-resolving.org/urn:nbn:de:gbv:hil2-opus-2930
|
|
BASE
|
|
19 |
DeReKo-Archiv jetzt mit fünf Milliarden Textwörtern : Zum größten digitalen Textarchiv für deutsche Texte der Gegenwart
|
|
|
|
IDS Mannheim
|
|
20 |
Sprachressourcen in der Lehre : Erfahrungen, Einsatzszenarien, Nutzerwünsche
|
|
|
|
IDS Mannheim
|
|