21 | Uncovering Probabilistic Implications in Typological Knowledge Bases ...
BASE
22 | Back to the Future -- Sequential Alignment of Text Representations ...
23 | Combining Sentiment Lexica with a Multi-View Variational Autoencoder ...
25 | Unsupervised Discovery of Gendered Language through Latent-Variable Modeling ...
27 | What Do Language Representations Really Represent?

In: Bjerva, Johannes; Östling, Robert; Veiga, Maria Han; Tiedemann, Jörg; Augenstein, Isabelle (2019). What Do Language Representations Really Represent? Computational Linguistics, 45(2):381-389.

Abstract:
A neural language model trained on a text corpus can be used to induce distributed representations of words, such that similar words end up with similar representations. If the corpus is multilingual, the same model can be used to learn distributed representations of languages, such that similar languages end up with similar representations. We show that this holds even when the multilingual corpus has been translated into English, by picking up the faint signal left by the source languages. However, just as it is a thorny problem to separate semantic from syntactic similarity in word representations, it is not obvious what type of similarity is captured by language representations. We investigate correlations and causal relationships between language representations learned from translations on one hand, and genetic, geographical, and several levels of structural similarity between languages on the other. Of these, structural similarity is found to correlate most strongly with language representation similarity, whereas genetic relationships, a convenient benchmark used for evaluation in previous work, appear to be a confounding factor. Apart from implications about translation effects, we see this more generally as a case where NLP and linguistic typology can interact and benefit one another.

Keywords:
530 Physics; Artificial Intelligence; Computer Science Applications; Institute for Computational Science; Language and Linguistics; Linguistics and Language

URL: https://www.zora.uzh.ch/id/eprint/185185/ https://doi.org/10.1162/coli_a_00351 https://doi.org/10.5167/uzh-185185 https://www.zora.uzh.ch/id/eprint/185185/1/coli_a_00351.pdf
28 | On Evaluating Embedding Models for Knowledge Base Completion
29 | Specializing Distributional Vectors of All Words for Lexical Entailment
30 | Copenhagen at CoNLL--SIGMORPHON 2018: Multilingual Inflection in Context with Explicit Morphosyntactic Decoding ...
31 | Parameter Sharing between Dependency Parsers for Related Languages ...
33 | From Phonology to Syntax: Unsupervised Linguistic Typology at Different Levels with Language Embeddings ...
34 | Learning Distributional Token Representations from Visual Features
35 | Tracking Typological Traits of Uralic Languages in Distributed Language Representations ...
36 | Turing at SemEval-2017 Task 8: Sequential Approach to Rumour Stance Classification with Branch-LSTM