Hits 1 – 11 of 11

1	Simple and Effective Unsupervised Speech Synthesis ...
	Liu, Alexander H.; Lai, Cheng-I Jeff; Hsu, Wei-Ning. - : arXiv, 2022

2	Text-Free Image-to-Speech Synthesis Using Learned Segmental Units ...
	The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing 2021; Glass, James; Harwath, David. - : Underline Science Inc., 2021

3	A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning
	Khurana, Sameer; Laurent, Antoine; Hsu, Wei-Ning...
	In: Interspeech 2020 ; https://hal.archives-ouvertes.fr/hal-02912029 ; Interspeech 2020, Oct 2020, Shanghai, China (2020)

4	A Convolutional Deep Markov Model for Unsupervised Speech Representation Learning ...
	Khurana, Sameer; Laurent, Antoine; Hsu, Wei-Ning; Chorowski, Jan; Lancucki, Adrian; Marxer, Ricard; Glass, James. - : arXiv, 2020
	Abstract: Probabilistic Latent Variable Models (LVMs) provide an alternative to self-supervised learning approaches for linguistic representation learning from speech. LVMs admit an intuitive probabilistic interpretation where the latent structure shapes the information extracted from the signal. Even though LVMs have recently seen a renewed interest due to the introduction of Variational Autoencoders (VAEs), their use for speech representation learning remains largely unexplored. In this work, we propose Convolutional Deep Markov Model (ConvDMM), a Gaussian state-space model with non-linear emission and transition functions modelled by deep neural networks. This unsupervised model is trained using black box variational inference. A deep convolutional neural network is used as an inference network for structured variational approximation. When trained on a large scale speech dataset (LibriSpeech), ConvDMM produces features that significantly outperform multiple self-supervised feature extracting methods on linear ... : Proceedings of Interspeech, 2020 ...
	Keyword: Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Machine Learning cs.LG; Sound cs.SD
	URL: https://arxiv.org/abs/2006.02547 https://dx.doi.org/10.48550/arxiv.2006.02547

5	Learning Hierarchical Discrete Linguistic Units from Visually-Grounded Speech ...
	Harwath, David; Hsu, Wei-Ning; Glass, James. - : arXiv, 2019

6	Transfer Learning from Audio-Visual Grounding to Speech Recognition ...
	Hsu, Wei-Ning; Harwath, David; Glass, James. - : arXiv, 2019

7	Unsupervised Adaptation with Interpretable Disentangled Representations for Distant Conversational Speech Recognition ...
	Hsu, Wei-Ning; Tang, Hao; Glass, James. - : arXiv, 2018

8	Unsupervised Representation Learning of Speech for Dialect Identification ...
	Shon, Suwon; Hsu, Wei-Ning; Glass, James. - : arXiv, 2018

9	Unsupervised Learning of Disentangled and Interpretable Representations from Sequential Data ...
	Hsu, Wei-Ning; Zhang, Yu; Glass, James. - : arXiv, 2017

10	Learning Latent Representations for Speech Generation and Transformation ...
	Hsu, Wei-Ning; Zhang, Yu; Glass, James. - : arXiv, 2017

11	Recurrent Neural Network Encoder with Attention for Community Question Answering ...
	Hsu, Wei-Ning; Zhang, Yu; Glass, James. - : arXiv, 2016
