
Search in the Catalogues and Directories

Hits 1 – 11 of 11

1
Measuring the Impact of Individual Domain Factors in Self-Supervised Pre-Training ...
BASE
2
Simple and Effective Unsupervised Speech Synthesis ...
BASE
3
Textless Speech-to-Speech Translation on Real Data ...
BASE
4
HuBERT: Self-Supervised Speech Representation Learning by Masked Prediction of Hidden Units ...
BASE
5
Textless Speech Emotion Conversion using Discrete and Decomposed Representations ...
BASE
6
A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning ...
BASE
7
Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech ...
BASE
8
Transfer Learning from Audio-Visual Grounding to Speech Recognition ...
Abstract: Transfer learning aims to reduce the amount of data required to excel at a new task by re-using the knowledge acquired from learning other related tasks. This paper proposes a novel transfer learning scenario, which distills robust phonetic features from grounding models that are trained to tell whether a pair of image and speech are semantically correlated, without using any textual transcripts. As semantics of speech are largely determined by its lexical content, grounding models learn to preserve phonetic information while disregarding uncorrelated factors, such as speaker and channel. To study the properties of features distilled from different layers, we use them as input separately to train multiple speech recognition models. Empirical results demonstrate that layers closer to input retain more phonetic information, while following layers exhibit greater invariance to domain shift. Moreover, while most previous studies include training data for speech recognition for feature extractor training, our ...
Accepted to Interspeech 2019. 4 pages, 2 figures ...
Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Machine Learning cs.LG; Sound cs.SD
URL: https://arxiv.org/abs/1907.04355
https://dx.doi.org/10.48550/arxiv.1907.04355
BASE
9
Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition ...
Hsu, Wei-Ning; Tang, Hao; Glass, James. - : arXiv, 2018
BASE
10
Unsupervised Representation Learning of Speech for Dialect Identification ...
BASE
11
Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data ...
Hsu, Wei-Ning; Zhang, Yu; Glass, James. - : arXiv, 2017
BASE
