121. Emotion Intensity and its Control for Emotional Voice Conversion ... [BASE]
122. Suum Cuique: Studying Bias in Taboo Detection with a Community Perspective ... [BASE]
123. An Evaluation Dataset for Legal Word Embedding: A Case Study On Chinese Codex ... [BASE]
124. Application of Quantum Density Matrix in Classical Question Answering and Classical Image Classification ... [BASE]
125. Perturbations in the Wild: Leveraging Human-Written Text Perturbations for Realistic Adversarial Attack and Defense ... [BASE]
126. Detecting early signs of depression in the conversational domain: The role of transfer learning in low-resource scenarios ... [BASE]
127. LoL: A Comparative Regularization Loss over Query Reformulation Losses for Pseudo-Relevance Feedback ... [BASE]
128. Counterfactual Explanations for Natural Language Interfaces ... [BASE]
130. Probing Speech Emotion Recognition Transformers for Linguistic Knowledge ... [BASE]

Abstract: Large, pre-trained neural networks consisting of self-attention layers (transformers) have recently achieved state-of-the-art results on several speech emotion recognition (SER) datasets. These models are typically pre-trained in a self-supervised manner with the goal of improving automatic speech recognition performance -- and thus, of understanding linguistic information. In this work, we investigate the extent to which this information is exploited during SER fine-tuning. Using a reproducible methodology based on open-source tools, we synthesise prosodically neutral speech utterances while varying the sentiment of the text. Valence predictions of the transformer model are very reactive to positive and negative sentiment content, as well as to negations, but not to intensifiers or reducers, while none of these linguistic features impact arousal or dominance. These findings show that transformers can successfully leverage linguistic information to improve their valence predictions, and that linguistic analysis should ... : This work has been submitted for publication to Interspeech 2022 ...

Keywords: Computation and Language (cs.CL); FOS: Computer and information sciences; Machine Learning (cs.LG)

URL: https://dx.doi.org/10.48550/arxiv.2204.00400 ; https://arxiv.org/abs/2204.00400
131. GreaseLM: Graph REASoning Enhanced Language Models for Question Answering ... [BASE]
132. Position-based Prompting for Health Outcome Generation ... [BASE]
133. Dilated Convolutional Neural Networks for Lightweight Diacritics Restoration ... [BASE]
134. GatorTron: A Large Clinical Language Model to Unlock Patient Information from Unstructured Electronic Health Records ... [BASE]
135. FALCON: Fast Visual Concept Learning by Integrating Images, Linguistic descriptions, and Conceptual Relations ... [BASE]
136. Low-dimensional representation of infant and adult vocalization acoustics ... [BASE]
137. Chain-based Discriminative Autoencoders for Speech Recognition ... [BASE]
138. Error Correction in ASR using Sequence-to-Sequence Models ... [BASE]
139. Filter-based Discriminative Autoencoders for Children Speech Recognition ... [BASE]
140. Speaking clearly improves speech segmentation by statistical learning under optimal listening conditions ... [BASE]