3 |
Prosodic Boundary Prediction Model for Vietnamese Text-To-Speech
|
|
|
|
In: Proc. Interspeech 2021 ; Interspeech 2021 ; https://hal.archives-ouvertes.fr/hal-03329116 ; Interspeech 2021, Aug 2021, Brno, Czech Republic. pp.3885-3889, ⟨10.21437/interspeech.2021-125⟩ (2021)
|
|
Abstract:
This research aims to build a prosodic boundary prediction model to improve the naturalness of Vietnamese speech synthesis. The model can be used directly to predict prosodic boundaries in the synthesis phase of statistical parametric or end-to-end speech systems. Besides conventional features related to Part-Of-Speech (POS), this paper proposes two efficient features for predicting prosodic boundaries, syntactic blocks and syntactic links, based on a thorough analysis of a Vietnamese dataset. Syntactic blocks are syntactic phrases whose sizes are bounded in their constituent syntactic tree. A syntactic link between two adjacent words is calculated from the distance between them in the syntax tree. The experimental results show that the two proposed predictors improve the quality of the boundary prediction model using a decision tree classification algorithm, with an F1 score about 36.4% higher than that of the model with only POS features. The final boundary prediction model, which combines POS, syntactic block, and syntactic link features and uses the LightGBM algorithm, achieves the best F1 score, 87.0%, on the test data. The proposed model helps TTS systems built with HMM-based, DNN-based, or end-to-end speech synthesis techniques improve by about 0.3 MOS points (i.e., 6 to 10%) compared with the same systems without the proposed model.
|
|
Keywords:
[INFO.INFO-HC]Computer Science [cs]/Human-Computer Interaction [cs.HC]; [INFO.INFO-SD]Computer Science [cs]/Sound [cs.SD]; [INFO.INFO-TS]Computer Science [cs]/Signal and Image Processing; [SHS.LANGUE]Humanities and Social Sciences/Linguistics; pause prediction; prosodic boundary; Prosody modeling; speech synthesis; Text-To-Speech; Vietnamese
|
|
URL: https://hal.archives-ouvertes.fr/hal-03329116/file/trang21_interspeech.pdf https://hal.archives-ouvertes.fr/hal-03329116 https://hal.archives-ouvertes.fr/hal-03329116/document https://doi.org/10.21437/interspeech.2021-125
|
|
BASE
|
|
|
|
4 |
The Role of the Auditory and Visual Modalities in the Perceptual Identification of Brazilian Portuguese Statements and Echo Questions
|
|
|
|
In: ISSN: 0023-8309 ; Language and Speech ; https://hal.archives-ouvertes.fr/hal-02456308 ; Language and Speech, SAGE Publications (UK and US), 2021, 64 (1), pp.3-23. ⟨10.1177/0023830919898886⟩ ; https://journals.sagepub.com/doi/pdf/10.1177/0023830919898886 (2021)
|
|
BASE
|
|
|
|
5 |
Linguistic varieties in Brazil and beyond
|
|
|
|
In: ISSN: 1980-2552 ; Diadorim - Revista científica do programa de pós-graduação em letras vernáculas ; https://hal.archives-ouvertes.fr/hal-03512410 ; Diadorim - Revista científica do programa de pós-graduação em letras vernáculas, Université fédérale de Rio de Janeiro, 2021, Dossiê Língua e Literatura, 23 (1), pp.24-33. ⟨10.35520/diadorim.2021.v23n1a44441⟩ ; https://revistas.ufrj.br/index.php/diadorim/article/view/44441 (2021)
|
|
BASE
|
|
|
|
6 |
Visual channel influences the accuracy of listeners’ comprehension of Brazilian Portuguese intonation of wh-questions and wh-exclamations
|
|
|
|
In: 4TH PHONETICS AND PHONOLOGY IN EUROPE ; https://hal.archives-ouvertes.fr/hal-03512423 ; 4TH PHONETICS AND PHONOLOGY IN EUROPE, Jun 2021, Barcelona, Spain. pp.301-303 ; https://pape2021.upf.edu/ (2021)
|
|
BASE
|
|
|
|
7 |
Medida objetiva da variação prosódica entre línguas Românicas da França
|
|
|
|
In: XIX CONGRESSO INTERNACIONAL DA ASSOCIAÇÃO DE LINGUÍSTICA E FILOLOGIA DA AMÉRICA LATINA ; https://hal.archives-ouvertes.fr/hal-03511305 ; XIX CONGRESSO INTERNACIONAL DA ASSOCIAÇÃO DE LINGUÍSTICA E FILOLOGIA DA AMÉRICA LATINA, José Mendoza; Dermeval da Hora Oliveira; Miguel Oliveira Júnior, Aug 2021, Online, Bolivia ; https://www.mundoalfal.org/es/pt_ultimo_congreso (2021)
|
|
BASE
|
|
|
|
8 |
A prosódia como marca formal para os Marcadores Discursivos e suas funções
|
|
|
|
In: XIX CONGRESSO INTERNACIONAL DA ASSOCIAÇÃO DE LINGUÍSTICA E FILOLOGIA DA AMÉRICA LATINA ; https://hal.archives-ouvertes.fr/hal-03511307 ; XIX CONGRESSO INTERNACIONAL DA ASSOCIAÇÃO DE LINGUÍSTICA E FILOLOGIA DA AMÉRICA LATINA, José Mendoza; Dermeval da Hora Oliveira; Miguel Oliveira Júnior, Aug 2021, Online, Bolivia ; https://www.mundoalfal.org/es/pt_ultimo_congreso (2021)
|
|
BASE
|
|
|
|
9 |
Variedades linguísticas dentro e fora do Brasil
|
|
|
|
In: ISSN: 1980-2552 ; Diadorim - Revista científica do programa de pós-graduação em letras vernáculas ; https://hal.archives-ouvertes.fr/hal-03512413 ; Diadorim - Revista científica do programa de pós-graduação em letras vernáculas, Université fédérale de Rio de Janeiro, 2021, Dossiê Língua e Literatura, 23 (1), pp.14-23 ; https://revistas.ufrj.br/index.php/diadorim/article/view/44438 (2021)
|
|
BASE
|
|
|
|
10 |
O PAPEL DOS GESTOS FACIAIS NA PERCEPÇÃO DA ENTOAÇÃO DA ASSERÇÃO E DA QUESTÃO-ECO EM ÁUDIO LIMPO E DEGRADADO
|
|
|
|
In: ISBN 978-85-60453-54-2 ; Intercâmbio de Pesquisas em Linguística Aplicada ; https://hal.archives-ouvertes.fr/hal-03512419 ; Intercâmbio de Pesquisas em Linguística Aplicada, Pontifícia Universidade Católica de São Paulo, Nov 2021, São Paulo, Brazil (2021)
|
|
BASE
|
|
|
|
11 |
Prosodic speech acts: between acoustic codes and linguistic conventionalization
|
|
|
|
In: XIX CONGRESSO INTERNACIONAL DA ASSOCIAÇÃO DE LINGUÍSTICA E FILOLOGIA DA AMÉRICA LATINA ; https://hal.archives-ouvertes.fr/hal-03512418 ; XIX CONGRESSO INTERNACIONAL DA ASSOCIAÇÃO DE LINGUÍSTICA E FILOLOGIA DA AMÉRICA LATINA, José Mendoza; Dermeval da Hora Oliveira, Aug 2021, Online, Bolivia ; https://www.mundoalfal.org/es/pt_ultimo_congreso (2021)
|
|
BASE
|
|
|
|
12 |
Can the prosody of statements and various types of questions be identified in Gallo-Romance dialects?
|
|
|
|
In: 1st International Conference of Tone and Intonation (TAI 2021) ; https://hal.archives-ouvertes.fr/hal-03512401 ; 1st International Conference of Tone and Intonation (TAI 2021), Dec 2021, Sønderborg, Denmark ; https://event.sdu.dk/tai2021 (2021)
|
|
BASE
|
|
|
|
13 |
Percepção audiovisual da entoação modal do português do Brasil
|
|
|
|
In: Gradus - Revista Brasileira de Fonologia de Laboratório ; https://hal.archives-ouvertes.fr/hal-03094466 ; Gradus - Revista Brasileira de Fonologia de Laboratório, 2020, 5 (1), pp.47-70. ⟨10.47627/gradus.v5i1.148⟩ (2020)
|
|
BASE
|
|
|
|
14 |
The combined Perception of Socio-affective Prosody: Cultural Differences in Pattern Matching
|
|
|
|
In: ISSN: 1342-8675 ; The Journal of the Phonetic Society of Japan ; https://hal.archives-ouvertes.fr/hal-03098638 ; The Journal of the Phonetic Society of Japan, The Phonetic Society of Japan, 2020, 24, pp.84-96. ⟨10.24467/onseikenkyu.24.0_84⟩ ; https://www.jstage.jst.go.jp/article/onseikenkyu/24/0/24_84/_article/-char/ja/ (2020)
|
|
BASE
|
|
|
|
15 |
Perception of Audio-visual Expressions in German and Cantonese by Native Speakers of Hindi
|
|
|
|
In: 10th International Conference on Speech Prosody 2020 ; https://hal.archives-ouvertes.fr/hal-03094428 ; 10th International Conference on Speech Prosody 2020, May 2020, Tokyo, Japan. pp.31-35, ⟨10.21437/SpeechProsody.2020-7⟩ (2020)
|
|
BASE
|
|
|
|
16 |
Statistical modeling of prosodic contours of four speech acts in Brazilian Portuguese
|
|
|
|
In: 10th International Conference on Speech Prosody 2020 ; https://hal.archives-ouvertes.fr/hal-03094488 ; 10th International Conference on Speech Prosody 2020, May 2020, Tokyo, Japan. pp.404-408, ⟨10.21437/SpeechProsody.2020-83⟩ (2020)
|
|
BASE
|
|
|
|
17 |
Cross cultural differences in arousal and valence perceptions of voice quality
|
|
|
|
In: 10th International Conference on Speech Prosody 2020 ; https://hal.archives-ouvertes.fr/hal-03094524 ; 10th International Conference on Speech Prosody 2020, May 2020, Tokyo, Japan. pp.720-724, ⟨10.21437/SpeechProsody.2020-147⟩ (2020)
|
|
BASE
|
|
|
|
18 |
VISUAL AND AUDITORY CUES OF ASSERTIONS AND QUESTIONS IN BRAZILIAN PORTUGUESE AND MEXICAN SPANISH: A COMPARATIVE STUDY
|
|
|
|
In: ISSN: 2236-9740 ; Journal of Speech Sciences ; https://hal.archives-ouvertes.fr/hal-03012006 ; Journal of Speech Sciences, Journal of Speech Sciences, 2020, pp.73 - 92. ⟨10.20396/joss.v9i00.14958⟩ ; http://revistas.iel.unicamp.br/ojs_joss/index.php/journalofspeechsciences/article/view/185 (2020)
|
|
BASE
|
|
|
|
19 |
Fala e multimodalidade
|
|
|
|
In: Verbetes LBASS ; https://hal.archives-ouvertes.fr/hal-03511326 ; Verbetes LBASS, 2020, http://www.letras.ufmg.br/padrao_cms/index.php?web=lbass&lang=1&page=3619&menu=&tipo=1 (2020)
|
|
BASE
|
|
|
|
20 |
The perception of prosodic cues in Brazilian Portuguese statements and echo-questions: analysis by resynthesis
|
|
|
|
In: Phonetics and Phonology in Europe Conference ; https://hal.archives-ouvertes.fr/hal-02425674 ; Phonetics and Phonology in Europe Conference, Jun 2019, Lecce, Italy (2019)
|
|
BASE
|
|
|
|
|
|