
Search in the Catalogues and Directories

Hits 1 – 16 of 16

1
Speech Resynthesis from Discrete Disentangled Self-Supervised Representations
In: INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association ; https://hal.inria.fr/hal-03329245 ; INTERSPEECH 2021 - Annual Conference of the International Speech Communication Association, Aug 2021, Brno, Czech Republic (2021)
BASE
2
On Generative Spoken Language Modeling from Raw Audio
In: EISSN: 2307-387X ; Transactions of the Association for Computational Linguistics ; https://hal.inria.fr/hal-03329219 ; Transactions of the Association for Computational Linguistics, The MIT Press, 2021 (2021)
BASE
3
Generative Spoken Language Modeling from Raw Audio ...
BASE
4
Generative Spoken Language Modeling from Raw Audio ...
BASE
5
Textless Speech Emotion Conversion using Discrete and Decomposed Representations ...
Abstract: Speech emotion conversion is the task of modifying the perceived emotion of a speech utterance while preserving the lexical content and speaker identity. In this study, we cast the problem of emotion conversion as a spoken language translation task. We use a decomposition of the speech signal into discrete learned representations, consisting of phonetic-content units, prosodic features, speaker, and emotion. First, we modify the speech content by translating the phonetic-content units to a target emotion, and then predict the prosodic features based on these units. Finally, the speech waveform is generated by feeding the predicted representations into a neural vocoder. Such a paradigm allows us to go beyond spectral and parametric changes of the signal, and model non-verbal vocalizations, such as laughter insertion, yawning removal, etc. We demonstrate objectively and subjectively that the proposed method is vastly superior to current approaches and even beats text-based systems in terms of perceived emotion ...
Keyword: Artificial Intelligence cs.AI; Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Machine Learning cs.LG; Sound cs.SD
URL: https://arxiv.org/abs/2111.07402
https://dx.doi.org/10.48550/arxiv.2111.07402
BASE
6
Self-Supervised Contrastive Learning for Unsupervised Phoneme Segmentation ...
BASE
7
Phoneme Boundary Detection using Learnable Segmental Features ...
BASE
8
The influence of lexical selection disruptions on articulation
In: J Exp Psychol Learn Mem Cogn (2018)
BASE
9
Automatic Measurement of Pre-aspiration ...
BASE
10
Learning Similarity Functions for Pronunciation Variations ...
BASE
11
Sequence Segmentation Using Joint RNN and Structured Prediction Models
BASE
12
Fine-grained Analysis of Sentence Embeddings Using Auxiliary Prediction Tasks ...
BASE
13
Sequence Segmentation Using Joint RNN and Structured Prediction Models ...
BASE
14
Automatic measurement of vowel duration via structured prediction ...
BASE
15
Automatic measurement of vowel duration via structured prediction
Adi, Yossi; Keshet, Joseph; Cibelli, Emily. - : Acoustical Society of America, 2016
BASE
16
Vowel Duration Measurement Using Deep Neural Networks
BASE

Hits by source type: Open access documents 16; Catalogues, Bibliographies, Linked Open Data catalogues, and Online resources 0.