
Search in the Catalogues and Directories

Page: 1 2
Hits 1 – 20 of 38

1
BAS Edition of German Distant Speech Data Corpus 2014/2015
Stephan Radeck-Arneth; Benjamin Milde; Arvid Lange. - : Bavarian Archive for Speech Signals (BAS), 2022
BASE
2
Introducing Various Semantic Models for Amharic: Experimentation and Evaluation with Multiple Tasks and Datasets
In: Future Internet ; Volume 13 ; Issue 11 (2021)
BASE
3
Using Semantics for Granularities of Tokenization
In: Computational Linguistics, Vol 44, Iss 3, Pp 483-524 (2018)
Abstract: Depending on downstream applications, it is advisable to extend the notion of tokenization from low-level character-based token boundary detection to identification of meaningful and useful language units. This entails both identifying units composed of several single words that form a multiword expression (MWE), as well as splitting single-word compounds into their meaningful parts. In this article, we introduce unsupervised and knowledge-free methods for these two tasks. The main novelty of our research is based on the fact that methods are primarily based on distributional similarity, of which we use two flavors: a sparse count-based and a dense neural-based distributional semantic model. First, we introduce DRUID, which is a method for detecting MWEs. The evaluation on MWE-annotated data sets in two languages and newly extracted evaluation data sets for 32 languages shows that DRUID compares favorably over previous methods not utilizing distributional information. Second, we present SECOS, an algorithm for decompounding close compounds. In an evaluation of four dedicated decompounding data sets across four languages and on data sets extracted from Wiktionary for 14 languages, we demonstrate the superiority of our approach over unsupervised baselines, sometimes even matching the performance of previous language-specific and supervised methods. In a final experiment, we show how both decompounding and MWE information can be used in information retrieval. Here, we obtain the best results when combining word information with MWEs and the compound parts in a bag-of-words retrieval set-up. Overall, our methodology paves the way to automatic detection of lexical units beyond standard tokenization techniques without language-specific preprocessing steps such as POS tagging.
Keyword: Computational linguistics. Natural language processing; P98-98.5
URL: https://doaj.org/article/f739e90bb4a24f6794543bcd4b417072
https://doi.org/10.1162/coli_a_00325
BASE
4
Using Pseudowords for Algorithm Comparison: An Evaluation Framework for Graph-based Word Sense Induction
Biemann, Chris; Cecchini, Flavio Massimiliano; Riedl, Martin. - Linköping (SWE) : Linköping University Electronic Press, 2017
BASE
5
Towards a Historical Text Re-use Detection
Müller, Martin; Franzini, Greta; Büchler, Marco. - Cham (CHE) : Springer International Publishing, 2014
BASE
6
Webbasierte linguistische Forschung: Möglichkeiten und Begrenzungen beim Umgang mit Massendaten
In: Linguistik Online, Vol 61, Iss 4 (2014)
BASE
7
SemEval-2013 task 5: Evaluating phrasal semantics
In: http://www.aclweb.org/anthology/S/S13/S13-2007.pdf (2013)
BASE
8
Using Distributional Similarity for Lexical Expansion in Knowledge-based Word Sense Disambiguation
In: http://aclweb.org/anthology/C/C12/C12-1109.pdf (2012)
BASE
9
Ukp: Computing semantic textual similarity by combining multiple content similarity measures
In: http://aclweb.org/anthology//S/S12/S12-1059.pdf (2012)
BASE
10
Quantifying semantics using complex network analysis
In: http://aclweb.org/anthology/C/C12/C12-1017.pdf (2012)
BASE
11
ASV Toolbox – A Modular Collection of Language Exploration Tools
In: http://asv.informatik.uni-leipzig.de/publication/file/94/biemann-etal-08-toolbox.pdf (2008)
BASE
12
ASV Toolbox – A Modular Collection of Language Exploration Tools
In: http://www.lrec-conf.org/proceedings/lrec2008/pdf/447_paper.pdf (2008)
BASE
13
workshop on Graph-based Algorithms for Natural Language Processing Workshop chairs:
In: http://www.aclweb.org/anthology-new/W/W08/W08-20.pdf (2008)
BASE
14
Unsupervised part-of-speech tagging employing efficient graph clustering
In: http://wortschatz.uni-leipzig.de/~cbiemann/pub/2006/unsupos_graph_coling06SRW.pdf (2006)
BASE
15
Automatic extension of feature-based semantic lexicons via contextual attributes
In: http://pi7.fernuni-hagen.de/osswald/papers/gfkl05.pdf (2006)
BASE
16
Unsupervised part-of-speech tagging employing efficient graph clustering
In: http://acl.ldc.upenn.edu/P/P06/P06-3002.pdf (2006)
BASE
17
Unsupervised part-of-speech tagging employing efficient graph clustering
In: http://machinelearningtext.pbworks.com/w/file/fetch/48158637/UnsupPOSp7-biemann.pdf (2006)
BASE
18
Rigorous dimensionality reduction through linguistically motivated feature selection for text categorisation
In: http://wortschatz.uni-leipzig.de/~fwitschel/papers/nodalida.pdf (2005)
BASE
19
Disentangling from babylonian confusion – unsupervised language identification
In: http://wortschatz.uni-leipzig.de/~cbiemann/pub/2005/cicling05.pdf (2005)
BASE
20
Automatic acquisition of paradigmatic relations using iterated co-occurrences
In: http://wortschatz.uni-leipzig.de/~sbordag/papers/BiemannBordagQuasthoffAutomatic04.pdf (2004)
BASE

