Hits 1 – 20 of 76

1	Evaluation of Tacotron Based Synthesizers for Spanish and Basque
	Víctor García; Inma Hernáez; Eva Navas
	In: Applied Sciences; Volume 12; Issue 3; Pages: 1686 (2022)
	BASE

2	Prosodic Boundary Prediction Model for Vietnamese Text-To-Speech
	Trang, Nguyen Thi Thu; Ky, Nguyen; Rilliard, Albert...
	In: Proc. Interspeech 2021, Aug 2021, Brno, Czech Republic, pp. 3885-3889, ⟨10.21437/interspeech.2021-125⟩ ; https://hal.archives-ouvertes.fr/hal-03329116 (2021)
	BASE

3	WaveRNN checkpoint ...
	Sir.Ai. - : Zenodo, 2021
	BASE

4	Neural Vocoder Checkpoint Collections ...
	SIR.AI. - : Zenodo, 2021
	BASE

5	Neural Vocoder Checkpoint Collections ...
	SIR.AI. - : Zenodo, 2021
	BASE

6	Multilingual TTS Cloning model ...
	Sir.Ai. - : Zenodo, 2021
	BASE

7	Neural Vocoder Checkpoint Collections ...
	SIR.AI. - : Zenodo, 2021
	BASE

8	Multilingual TTS Cloning model ...
	Sir.Ai. - : Zenodo, 2021
	BASE

9	Ressources for End-to-End French Text-to-Speech Blizzard challenge ...
	Bailly, Gérard; Perrotin, Olivier; Lenglet, Martin. - : Zenodo, 2021
	BASE

10	Ressources for End-to-End French Text-to-Speech Blizzard challenge ...
	Bailly, Gérard; Perrotin, Olivier; Lenglet, Martin. - : Zenodo, 2021
	BASE

11	Korean Prosody Phrase Boundary Prediction Model for Speech Synthesis Service in Smart Healthcare
	Minho Kim; Youngim Jung; Hyuk-Chul Kwon
	In: Electronics ; Volume 10 ; Issue 19 (2021)
	BASE

12	Database of speech corpora of Czech laryngectomy patients
	Matoušek, Jindřich; Tihelka, Daniel; Jůzová, Markéta. - : University of West Bohemia, Department of Cybernetics, 2020
	BASE

13	Text-to-Speech Synthesis Using Found Data for Low-Resource Languages
	Cooper, Erica Lindsay. - 2019
	Abstract: Text-to-speech synthesis is a key component of interactive, speech-based systems. Typically, building a high-quality voice requires collecting dozens of hours of speech from a single professional speaker in an anechoic chamber with a high-quality microphone. There are about 7,000 languages spoken in the world, and most do not enjoy the speech research attention historically paid to such languages as English, Spanish, Mandarin, and Japanese. Speakers of these so-called "low-resource languages" therefore do not equally benefit from these technological advances. While it takes a great deal of time and resources to collect a traditional text-to-speech corpus for a given language, we may instead be able to make use of various sources of "found" data which may be available. In particular, sources such as radio broadcast news and ASR corpora are available for many languages. While this kind of data does not exactly match what one would collect for a more standard TTS corpus, it may nevertheless contain parts which are usable for producing natural and intelligible parametric TTS voices. In the first part of this thesis, we examine various types of found speech data in comparison with data collected for TTS, in terms of a variety of acoustic and prosodic features. We find that radio broadcast news in particular is a good match. Audiobooks may also be a good match despite their largely more expressive style, and certain speakers in conversational and read ASR corpora also resemble TTS speakers in their manner of speaking and thus their data may be usable for training TTS voices. In the rest of the thesis, we conduct a variety of experiments in training voices on non-traditional sources of data, such as ASR data, radio broadcast news, and audiobooks. We aim to discover which methods produce the most intelligible and natural-sounding voices, focusing on three main approaches: 1) Training data subset selection. In noisy, heterogeneous data sources, we may wish to locate subsets of the data that are well-suited for building voices, based on acoustic and prosodic features that are known to correspond with TTS-style speech, while excluding utterances that introduce noise or other artifacts. We find that choosing subsets of speakers for training data can result in voices that are more intelligible. 2) Augmenting the frontend feature set with new features. In cleaner sources of found data, we may wish to train voices on all of the data, but we may get improvements in naturalness by including acoustic and prosodic features at the frontend and synthesizing in a manner that better matches the TTS style. We find that this approach is promising for creating more natural-sounding voices, regardless of the underlying acoustic model. 3) Adaptation. Another way to make use of high-quality data while also including informative acoustic and prosodic features is to adapt to subsets, rather than to select and train only on subsets. We also experiment with training on mixed high- and low-quality data, and adapting towards the high-quality set, which produces more intelligible voices than training on either type of data by itself. We hope that our findings may serve as guidelines for anyone wishing to build their own TTS voice using non-traditional sources of found data.
	Keyword: Computer science; Speech synthesis; Text-to-speech software
	URL: https://doi.org/10.7916/d8-vdzp-j870
	BASE

14	Exploring Efficient Neural Architectures for Linguistic–Acoustic Mapping in Text-To-Speech
	Santiago Pascual; Joan Serrà; Antonio Bonafonte
	In: Applied Sciences ; Volume 9 ; Issue 16 (2019)
	BASE

15	An Scéalaí: autonomous learners harnessing speech and language technologies ; SLaTE 2019: 8th ISCA Workshop on Speech and Language Technology in Education
	Ní Chiaráin, Neasa; Ní Chasaide, Ailbhe. - : ISCA, 2019
	BASE

16	Towards Machine Speech-to-speech Translation
	Nakamura, Satoshi; Sudoh, Katsuhito; Sakti, Sakriani. - 2019
	BASE

17	Speech Synthesis Using Syllable For Marathi Language ...
	Pravin M Ghate; S.D. Shirbhadurkar. - : Zenodo, 2018
	BASE

18	Speech Synthesis Using Syllable For Marathi Language ...
	Pravin M Ghate; S.D. Shirbhadurkar. - : Zenodo, 2018
	BASE

19	Statistical parametric speech synthesis using conversational data and phenomena
	Dall, Rasmus. - : The University of Edinburgh, 2017
	BASE

20	Varieeruva vältega sõnade hääldusuuringud kõnesünteesi teenistuses
	Liisi Piits; Mari-Liis Kalvik
	In: Eesti Rakenduslingvistika Ühingu Aastaraamat, Vol 13, Pp 123-140 (2017)
	BASE

