42. An Experimental Approach to the Perception of Empathy in Speech ...
BASE

43. Ressources for End-to-End French Text-to-Speech Blizzard challenge ...
BASE

44. Ressources for End-to-End French Text-to-Speech Blizzard challenge ...
BASE

45. An Experimental Approach to the Perception of Empathy in Speech ...
BASE

46. Generative Adversarial Networks for Cross-Lingual Voice Conversion
BASE

47. Sequence-to-Sequence Acoustic Modeling with Semi-Stepwise Monotonic Attention for Speech Synthesis
In: Applied Sciences ; Volume 11 ; Issue 21 (2021)
BASE

48. Acoustic Word Embeddings for End-to-End Speech Synthesis
In: Applied Sciences ; Volume 11 ; Issue 19 (2021)
Abstract:
The most recent end-to-end speech synthesis systems use phonemes as acoustic input tokens and ignore the information about which word the phonemes come from. However, many words have their specific prosody type, which may significantly affect the naturalness. Prior works have employed pre-trained linguistic word embeddings as TTS system input. However, since linguistic information is not directly relevant to how words are pronounced, TTS quality improvement of these systems is mild. In this paper, we propose a novel and effective way of jointly training acoustic phone and word embeddings for end-to-end TTS systems. Experiments on the LJSpeech dataset show that the acoustic word embeddings dramatically decrease both the training and validation loss in phone-level prosody prediction. Subjective evaluations on naturalness demonstrate that the incorporation of acoustic word embeddings can significantly outperform both pure phone-based system and the TTS system with pre-trained linguistic word embedding.
Keyword:
acoustic input tokens; naturalness; speech synthesis; word embedding
URL: https://doi.org/10.3390/app11199010
BASE

49. Challenges to Internationalisation of University Programmes: A Systematic Thematic Synthesis of Qualitative Research on Learner-Centred English Medium Instruction (EMI) Pedagogy
In: Sustainability ; Volume 13 ; Issue 22 (2021)
BASE

50. Discriminative Multi-Stream Postfilters Based on Deep Learning for Enhancing Statistical Parametric Speech Synthesis
In: Biomimetics ; Volume 6 ; Issue 1 (2021)
BASE

51. Korean Prosody Phrase Boundary Prediction Model for Speech Synthesis Service in Smart Healthcare
In: Electronics ; Volume 10 ; Issue 19 (2021)
BASE

52. Integrating a voice analysis-synthesis system with a TTS framework for controlling affect and speaker identity ; 2021 32nd Irish Signals and Systems Conference (ISSC)
BASE

53. Intercultural competence in teacher preparation programs in the United States and Canada: A meta-synthesis study
In: University of South Florida M3 Center Publishing (2021)
BASE

54. Acoustic analysis and measurements of distorted speech in the NZ population
BASE

55. Acoustic analysis and measurements of distorted speech in the NZ population
BASE

56. Acoustic analysis and measurements of distorted speech in the NZ population
BASE

57. “Song-advantage” or “Cost of Singing”? : A Research Synthesis of Classroom-based Intervention Studies Applying Lyrics-based Language Teaching (1972–2019)
BASE

58. “Song-advantage” or “Cost of Singing”? A Research Synthesis of Classroom-based Intervention Studies Applying Lyrics-based Language Teaching (1972–2019)
BASE

59. Thirty years of data-driven learning: Taking stock and charting new directions over time
Boulton, Alex; Vyatkina, Nina. - : University of Hawaii National Foreign Language Resource Center, 2021. : Center for Language & Technology, 2021. : (co-sponsored by Center for Open Educational Resources and Language Learning, University of Texas at Austin), 2021
BASE

60. CONTINUOUS AMERICAN SIGN LANGUAGE TRANSLATION WITH ENGLISH SPEECH SYNTHESIS USING ENCODER-DECODER APPROACH
BASE