7. Hybrid Hashtags: #YouKnowYoureAKiwiWhen Your Tweet Contains Māori and English
   In: Front Artif Intell (2020)
   BASE
8. Contextualised approaches to embedding word senses
Abstract:
Vector representations of text are an essential tool for modern Natural Language Processing (NLP), and there has been much work devoted to finding effective methods for obtaining such representations. Most previously proposed methods derive vector representations for individual words, known as word embeddings. While word embeddings have enabled considerable advances in NLP, they have a significant theoretical drawback: many words have several, often completely unrelated meanings; it seems dubious to conflate these multiple meanings into a single point in semantic space. This drawback has inspired an alternative, "multi-sense" approach to representing words. In this approach, rather than learning a single vector for each word, multiple vectors, or "sense embeddings," are learned corresponding to the individual meanings of the word. While this approach has not in general surpassed the word embedding approach, it has proved beneficial for a number of tasks such as word similarity estimation and word sense induction. One of the most significant recent advances in NLP has been the development of "contextualised" word embedding models. Whereas word embeddings model the semantic properties of words in isolation, contextualised models represent the meanings of words in context. This enables them to capture some of the vast array of linguistic phenomena that occur above the word level. I propose a number of new methods for learning sense embeddings which exploit contextualised techniques, based on the underlying hypothesis that the probability of a word occurring in a given context is equal to the sum of the probabilities of its individual senses occurring in the context. I first validate this hypothesis by using it to derive a simple method for learning sense embeddings inspired by the Skip-gram model. I then present a method for extracting sense embeddings from a contextualised word embedding model. 
Finally, I propose an end-to-end model for learning sense embeddings and show that it comprehensively outperforms previous sense embedding models on word sense induction, a standard evaluation task for such models. To demonstrate the model's flexibility, I apply it to several other word-sense-related tasks with good results.
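The hypothesis stated in this abstract — that the probability of a word occurring in a context equals the sum of the probabilities of its individual senses occurring there — can be sketched numerically. The toy vectors, vocabulary size, and the three-senses-of-"bank" setup below are illustrative assumptions, not taken from the thesis; only the sum-of-senses decomposition and the Skip-gram-style softmax scoring are from the abstract.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical toy setup: three sense embeddings for one word (say, "bank")
# plus embeddings for 50 other vocabulary items, and one context vector.
dim = 4
sense_vecs = rng.normal(size=(3, dim))   # one embedding per sense of the word
other_vecs = rng.normal(size=(50, dim))  # embeddings of other vocabulary items
context = rng.normal(size=dim)

def softmax(scores):
    e = np.exp(scores - scores.max())
    return e / e.sum()

# Skip-gram-style scoring: every sense competes individually in the softmax
# over the whole (sense-expanded) vocabulary.
all_vecs = np.vstack([sense_vecs, other_vecs])
probs = softmax(all_vecs @ context)

# The hypothesis: P(word | context) is the sum of its senses' probabilities.
p_word = probs[:3].sum()
```

Under this decomposition, training a model to predict words from contexts implicitly allocates probability mass among the senses, which is what lets sense embeddings be learned from unannotated text.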
Keywords: Computational Linguistics; Contextualised Embeddings; Natural Language Processing; Sense Embeddings
URL: https://hdl.handle.net/10289/13564
9. Hybrid Hashtags: #YouKnowYoureAKiwiWhen Your Tweet Contains Māori and English
10. Māori loanwords: a corpus of New Zealand English tweets
    In: Vocab@Leuven 2019 (2019)
13. WASSA-2017 shared task on emotion intensity
    In: WASSA 2017 (2017)
15. Acquiring and Exploiting Lexical Knowledge for Twitter Sentiment Analysis
16. Determining word–emotion associations from tweets by multi-label classification
    In: WI'16 (2016)
17. Building a Twitter opinion lexicon from automatically-annotated tweets
18. From opinion lexicons to sentiment classification of tweets and vice versa: a transfer learning approach
    In: WI'16 (2016)
19. Annotate-Sample-Average (ASA): A New Distant Supervision Approach for Twitter Sentiment Analysis
    In: 22nd European Conference on Artificial Intelligence (ECAI) (2016)
20. From unlabelled tweets to Twitter-specific opinion words
    In: SIGIR '15 (2015)