
Search in the Catalogues and Directories

Hits 1 – 20 of 76

1
Evaluation of Tacotron Based Synthesizers for Spanish and Basque
In: Applied Sciences; Volume 12; Issue 3; Pages: 1686 (2022)
BASE
2
Prosodic Boundary Prediction Model for Vietnamese Text-To-Speech
In: Proc. Interspeech 2021, Aug 2021, Brno, Czech Republic, pp. 3885-3889, doi:10.21437/interspeech.2021-125 ; https://hal.archives-ouvertes.fr/hal-03329116 (2021)
BASE
3
WaveRNN checkpoint ...
Sir.Ai. - : Zenodo, 2021
BASE
4
Neural Vocoder Checkpoint Collections ...
SIR.AI. - : Zenodo, 2021
BASE
5
Neural Vocoder Checkpoint Collections ...
SIR.AI. - : Zenodo, 2021
BASE
6
Multilingual TTS Cloning model ...
Sir.Ai. - : Zenodo, 2021
BASE
7
Neural Vocoder Checkpoint Collections ...
SIR.AI. - : Zenodo, 2021
BASE
8
Multilingual TTS Cloning model ...
Sir.Ai. - : Zenodo, 2021
BASE
9
Ressources for End-to-End French Text-to-Speech Blizzard challenge ...
BASE
10
Ressources for End-to-End French Text-to-Speech Blizzard challenge ...
BASE
11
Korean Prosody Phrase Boundary Prediction Model for Speech Synthesis Service in Smart Healthcare
In: Electronics ; Volume 10 ; Issue 19 (2021)
BASE
12
Database of speech corpora of Czech laryngectomy patients
Matoušek, Jindřich; Tihelka, Daniel; Jůzová, Markéta. - : University of West Bohemia, Department of Cybernetics, 2020
BASE
13
Text-to-Speech Synthesis Using Found Data for Low-Resource Languages
Abstract: Text-to-speech synthesis is a key component of interactive, speech-based systems. Typically, building a high-quality voice requires collecting dozens of hours of speech from a single professional speaker in an anechoic chamber with a high-quality microphone. There are about 7,000 languages spoken in the world, and most do not enjoy the speech research attention historically paid to such languages as English, Spanish, Mandarin, and Japanese. Speakers of these so-called "low-resource languages" therefore do not equally benefit from these technological advances. While it takes a great deal of time and resources to collect a traditional text-to-speech corpus for a given language, we may instead be able to make use of various sources of "found" data which may be available. In particular, sources such as radio broadcast news and ASR corpora are available for many languages. While this kind of data does not exactly match what one would collect for a more standard TTS corpus, it may nevertheless contain parts which are usable for producing natural and intelligible parametric TTS voices.
In the first part of this thesis, we examine various types of found speech data in comparison with data collected for TTS, in terms of a variety of acoustic and prosodic features. We find that radio broadcast news in particular is a good match. Audiobooks may also be a good match despite their largely more expressive style, and certain speakers in conversational and read ASR corpora also resemble TTS speakers in their manner of speaking and thus their data may be usable for training TTS voices.
In the rest of the thesis, we conduct a variety of experiments in training voices on non-traditional sources of data, such as ASR data, radio broadcast news, and audiobooks. We aim to discover which methods produce the most intelligible and natural-sounding voices, focusing on three main approaches:
1) Training data subset selection. In noisy, heterogeneous data sources, we may wish to locate subsets of the data that are well-suited for building voices, based on acoustic and prosodic features that are known to correspond with TTS-style speech, while excluding utterances that introduce noise or other artifacts. We find that choosing subsets of speakers for training data can result in voices that are more intelligible.
2) Augmenting the frontend feature set with new features. In cleaner sources of found data, we may wish to train voices on all of the data, but we may get improvements in naturalness by including acoustic and prosodic features at the frontend and synthesizing in a manner that better matches the TTS style. We find that this approach is promising for creating more natural-sounding voices, regardless of the underlying acoustic model.
3) Adaptation. Another way to make use of high-quality data while also including informative acoustic and prosodic features is to adapt to subsets, rather than to select and train only on subsets. We also experiment with training on mixed high- and low-quality data, and adapting towards the high-quality set, which produces more intelligible voices than training on either type of data by itself.
We hope that our findings may serve as guidelines for anyone wishing to build their own TTS voice using non-traditional sources of found data.
Keyword: Computer science; Speech synthesis; Text-to-speech software
URL: https://doi.org/10.7916/d8-vdzp-j870
BASE
14
Exploring Efficient Neural Architectures for Linguistic–Acoustic Mapping in Text-To-Speech
In: Applied Sciences ; Volume 9 ; Issue 16 (2019)
BASE
15
An Scéalaí: autonomous learners harnessing speech and language technologies ; SLaTE 2019: 8th ISCA Workshop on Speech and Language Technology in Education
BASE
16
Towards Machine Speech-to-speech Translation
BASE
17
Speech Synthesis Using Syllable For Marathi Language ...
BASE
18
Speech Synthesis Using Syllable For Marathi Language ...
BASE
19
Statistical parametric speech synthesis using conversational data and phenomena
Dall, Rasmus. - : The University of Edinburgh, 2017
BASE
20
Varieeruva vältega sõnade hääldusuuringud kõnesünteesi teenistuses
In: Eesti Rakenduslingvistika Ühingu Aastaraamat, Vol 13, Pp 123-140 (2017)
BASE
