Search in the Catalogues and Directories

Hits 1 – 20 of 43

1
Statistical parametric speech synthesis using conversational data and phenomena
Dall, Rasmus. - : The University of Edinburgh, 2017
Abstract: Statistical parametric text-to-speech synthesis currently relies on predefined and highly controlled prompts read in a “neutral” voice. This thesis presents work on utilising recordings of free conversation for the purpose of filled pause synthesis and as an inspiration for improved general modelling of speech for text-to-speech synthesis purposes. A corpus of both standard prompts and free conversation is presented and the potential usefulness of conversational speech as the basis for text-to-speech voices is validated. Additionally, through psycholinguistic experimentation it is shown that filled pauses can have potential subconscious benefits to the listener but that current text-to-speech voices cannot replicate these effects. A method for pronunciation variant forced alignment is presented in order to obtain a more accurate automatic speech segmentation, which is particularly poor for spontaneously produced speech. This pronunciation variant alignment is utilised not only to create a more accurate underlying acoustic model, but also as the driving force behind creating more natural pronunciation prediction at synthesis time. While this improves both the standard and spontaneous voices, the naturalness of spontaneous-speech-based voices still lags behind the quality of voices based on standard read prompts. Thus, the synthesis of filled pauses is investigated in relation to specific phonetic modelling of filled pauses and through techniques for the mixing of standard prompts with spontaneous utterances in order to retain the higher quality of standard-speech-based voices while still utilising the spontaneous speech for filled pause modelling. A method for predicting where to insert filled pauses in the speech stream is also developed and presented, relying on an analysis of human filled pause usage and a mix of language modelling methods. The method achieves an insertion accuracy in close agreement with human usage.
The various approaches are evaluated and their improvements documented throughout the thesis; at the end, however, the resulting filled pause quality is assessed through a repetition of the psycholinguistic experiments and an evaluation of the compilation of all developed methods.
Keyword: filled pause synthesis; neutral voice; phonetic modelling; pronunciation variant alignment; psycholinguistic; text-to-speech synthesis
URL: http://hdl.handle.net/1842/29016
BASE
2
Personalising speech-to-speech translation: Unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis
In: Computer speech and language. - Amsterdam [u.a.] : Elsevier 27 (2013) 2, 420-437
OLC Linguistik
3
Analysis of unsupervised cross-lingual speaker adaptation for HMM-based speech synthesis using KLD-based transform mapping
In: Speech communication. - Amsterdam [u.a.] : Elsevier 54 (2012) 6, 703-714
BLLDB
OLC Linguistik
4
Talker discrimination across languages
In: Speech communication. - Amsterdam [u.a.] : Elsevier 54 (2012) 6, 781-790
BLLDB
OLC Linguistik
5
Speaker similarity evaluation of foreign-accented speech synthesis using HMM-based speaker adaptation
Wester, Mirjam; Karhila, Reima. - : IEEE, 2011
BASE
6
The EMIME Mandarin Bilingual Database
Wester, Mirjam; Liang, Hui. - : The University of Edinburgh, 2011
BASE
7
Speaker adaptation and the evaluation of speaker similarity in the EMIME speech-to-speech translation project
In: http://infoscience.epfl.ch/record/150620 (2010)
BASE
8
Speaker adaptation and the evaluation of speaker similarity in the EMIME speech-to-speech translation project
BASE
9
Speaker adaptation and the evaluation of speaker similarity in the EMIME speech-to-speech translation project
Wester, Mirjam; Dines, John; Gibson, Matthew. - : 7th ISCA Speech Synthesis Workshop, 2010
BASE
10
Cross-lingual talker discrimination
Wester, Mirjam. - 2010
BASE
11
The EMIME Bilingual Database
Wester, Mirjam. - : The University of Edinburgh, 2010
BASE
12
Cross-lingual talker discrimination
Wester, Mirjam. - : Proceedings of Interspeech 2010, 2010
BASE
13
Articulatory feature recognition using dynamic Bayesian networks
In: Computer speech and language. - Amsterdam [u.a.] : Elsevier 21 (2007) 4, 620-640
BLLDB
OLC Linguistik
14
Speech production knowledge in automatic speech recognition
BASE
15
An elitist approach to automatic articulatory-acoustic feature classification for phonetic characterization of spoken language
In: Speech communication. - Amsterdam [u.a.] : Elsevier 47 (2005) 3, 290-311
BLLDB
OLC Linguistik
16
An elitist approach to automatic articulatory-acoustic feature classification for phonetic characterization of spoken language.
BASE
17
Asynchronous Articulatory Feature Recognition Using Dynamic Bayesian networks
BASE
18
On the Articulatory Representation of Speech within the Evolving Transformation System Formalism
BASE
19
Pronunciation modeling for ASR - knowledge-based and data-derived methods
In: Computer speech and language. - Amsterdam [u.a.] : Elsevier 17 (2003) 1, 69-86
OLC Linguistik
20
Syllable Classification Using Articulatory-Acoustic Features
Wester, Mirjam. - : International Speech Communication Association, 2003
BASE
