1. Homepage2Vec: Language-Agnostic Website Embedding and Classification ...
4. Classifying Dyads for Militarized Conflict Analysis
In: Proceedings of the 2021 Conference on Empirical Methods in Natural Language Processing (2021)
5. Cognitive Network Topology and Optimization of the Mental Lexicon ...
6. Linguistic effects on news headline success: Evidence from thousands of online field experiments (Registered Report Protocol)
In: PLoS One (2021)
7. On the Limitations of Cross-lingual Encoders as Exposed by Reference-Free Machine Translation Evaluation ...
8. On the limitations of cross-lingual encoders as exposed by reference-free machine translation evaluation
10. Crosslingual Document Embedding as Reduced-Rank Ridge Regression ...
Abstract: There has recently been much interest in extending vector-based word representations to multiple languages, such that words can be compared across languages. In this paper, we shift the focus from words to documents and introduce a method for embedding documents written in any language into a single, language-independent vector space. For training, our approach leverages a multilingual corpus where the same concept is covered in multiple languages (but not necessarily via exact translations), such as Wikipedia. Our method, Cr5 (Crosslingual reduced-rank ridge regression), starts by training a ridge-regression-based classifier that uses language-specific bag-of-word features in order to predict the concept that a given document is about. We show that, when constraining the learned weight matrix to be of low rank, it can be factored to obtain the desired mappings from language-specific bags-of-words to language-independent embeddings. As opposed to most prior methods, which use pretrained monolingual word ...
In: The Twelfth ACM International Conference on Web Search and Data Mining (WSDM '19) ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.1904.03922 https://arxiv.org/abs/1904.03922
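As a rough illustration of the low-rank factorization the abstract describes, the following is a minimal NumPy sketch of generic reduced-rank ridge regression for a single feature matrix. It is not the authors' Cr5 implementation: the function name `reduced_rank_ridge`, the parameters `rank` and `lam`, and the closed-form route through an SVD of the ridge-augmented fitted values are assumptions chosen for brevity.

```python
import numpy as np

def reduced_rank_ridge(X, Y, rank, lam=1.0):
    """Hypothetical helper: factor rank-constrained ridge weights as A @ B.

    X: (n, d) features, Y: (n, k) targets (e.g. one-hot concept labels).
    Returns A of shape (d, rank) and B of shape (rank, k).
    """
    d = X.shape[1]
    # Unconstrained ridge solution: W = (X^T X + lam * I)^{-1} X^T Y
    W = np.linalg.solve(X.T @ X + lam * np.eye(d), X.T @ Y)
    # Ridge equals least squares on the augmented design [X; sqrt(lam) * I],
    # so the rank constraint is applied to the fitted values of that design.
    X_aug = np.vstack([X, np.sqrt(lam) * np.eye(d)])
    fitted = X_aug @ W
    # Top right singular vectors span the best rank-`rank` output subspace.
    _, _, Vt = np.linalg.svd(fitted, full_matrices=False)
    V_r = Vt[:rank].T          # (k, rank)
    A = W @ V_r                # maps features to a rank-dimensional embedding
    B = V_r.T                  # maps embeddings to target scores
    return A, B
```

Under these assumptions, `X @ A` would play the role of a document embedding for one language, with `X` holding that language's bag-of-words features. How Cr5 couples several languages into a shared space is not shown here.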
14. Causal Effects of Brevity on Style and Success in Social Media ...
15. Message Distortion in Information Cascades
In: http://infoscience.epfl.ch/record/270657 (2019)
17. Reverse-Engineering Satire, or "Paper on Computational Humor Accepted despite Making Serious Advances"
In: http://infoscience.epfl.ch/record/271147 (2019)
18. Why the World Reads Wikipedia: Beyond English Speakers
In: http://infoscience.epfl.ch/record/270302 (2019)
19. Crosslingual Document Embedding as Reduced-Rank Ridge Regression
In: http://infoscience.epfl.ch/record/263893 (2019)
20. Churn Intent Detection in Multilingual Chatbot Conversations and Social Media ...