
Search in the Catalogues and Directories

Hits 1 – 19 of 19

1
Non-Parametric Bayesian Subspace Models for Acoustic Unit Discovery
In: https://hal.archives-ouvertes.fr/hal-03467205 ; 2021 (2021)
BASE
2
Speaker embeddings by modeling channel-wise correlations ...
Abstract: Speaker embeddings extracted with deep 2D convolutional neural networks are typically modeled as projections of first and second order statistics of channel-frequency pairs onto a linear layer, using either average or attentive pooling along the time axis. In this paper we examine an alternative pooling method, where pairwise correlations between channels for given frequencies are used as statistics. The method is inspired by style-transfer methods in computer vision, where the style of an image, modeled by the matrix of channel-wise correlations, is transferred to another image, in order to produce a new image having the style of the first and the content of the second. By drawing analogies between image style and speaker characteristics, and between image content and phonetic sequence, we explore the use of such channel-wise correlations features to train a ResNet architecture in an end-to-end fashion. Our experiments on VoxCeleb demonstrate the effectiveness of the proposed pooling method in speaker ... : Accepted at Interspeech 2021 ...
Keyword: Audio and Speech Processing eess.AS; Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering
URL: https://dx.doi.org/10.48550/arxiv.2104.02571
https://arxiv.org/abs/2104.02571
BASE
3
Bayesian multilingual topic model for zero-shot cross-lingual topic identification ...
BASE
4
A Hierarchical Subspace Model for Language-Attuned Acoustic Unit Discovery ...
BASE
5
AUTOMATIC LEARNING OF A PHONOLOGICAL SYSTEM: A CASE STUDY ON MBOSHI
In: International Conference Language Technologies for All (LT4ALL) ; https://hal.archives-ouvertes.fr/hal-03478242 ; International Conference Language Technologies for All (LT4ALL), 2019, Paris, France (2019)
BASE
6
Short-duration Speaker Verification (SdSV) Challenge 2021: the Challenge Evaluation Plan ...
BASE
7
Bayesian models for unit discovery on a very low resource language
In: IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) ; https://hal.archives-ouvertes.fr/hal-01709589 ; IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), Apr 2018, Calgary, Alberta, Canada (2018)
BASE
8
An Empirical Evaluation of Zero Resource Acoustic Unit Discovery ...
Liu, Chunxi; Yang, Jinyi; Sun, Ming. - arXiv, 2017
BASE
9
Approaches to automatic lexicon learning with limited training examples
In: http://infoscience.epfl.ch/record/203451 (2014)
BASE
10
Subspace Gaussian Mixture Models for speech recognition
In: http://infoscience.epfl.ch/record/203448 (2014)
BASE
11
Multilingual acoustic modeling for speech recognition based on subspace Gaussian Mixture Models
In: http://infoscience.epfl.ch/record/203450 (2014)
BASE
12
The Kaldi Speech Recognition Toolkit
In: http://infoscience.epfl.ch/record/192584 (2013)
BASE
13
The Kaldi Speech Recognition Toolkit
In: http://infoscience.epfl.ch/record/192761 (2013)
BASE
14
Transcribing meetings with the AMIDA systems
In: Institute of Electrical and Electronics Engineers. IEEE transactions on audio, speech and language processing. - New York, NY : Inst. 20 (2012) 2, 486-498
BLLDB
OLC Linguistik
15
The subspace Gaussian mixture model - a structured model for speech recognition
In: Computer speech and language. - Amsterdam [u.a.] : Elsevier 25 (2011) 2, 404-439
BLLDB
OLC Linguistik
16
Application of speaker- and language identification state-of-the-art techniques for emotion recognition
In: Speech communication. - Amsterdam [u.a.] : Elsevier 53 (2011) 9-10, 1172-1185
BLLDB
OLC Linguistik
17
Improving the Capacity of Language Recognition Systems to Handle Rare Languages Using Radio Broadcast Data
In: DTIC (2011)
BASE
18
Fusion of heterogeneous speaker recognition systems in the STBU submission for the NIST speaker recognition evaluation 2006
In: Institute of Electrical and Electronics Engineers. IEEE transactions on audio, speech and language processing. - New York, NY : Inst. 15 (2007) 7, 2072-2084
BLLDB
OLC Linguistik
19
Analysis of feature extraction and channel compensation in a GMM speaker recognition system
In: Institute of Electrical and Electronics Engineers. IEEE transactions on audio, speech and language processing. - New York, NY : Inst. 15 (2007) 7, 1979-1986
BLLDB
OLC Linguistik
