
Search in the Catalogues and Directories

Hits 1 – 17 of 17

1
A Complementary Joint Training Approach Using Unpaired Speech and Text for Low-Resource Automatic Speech Recognition ...
Du, Ye-Qian; Zhang, Jie; Zhu, Qiu-Shi. - : arXiv, 2022
BASE
2
XLST: Cross-lingual Self-training to Learn Multilingual Representation for Low Resource Speech Recognition ...
BASE
3
Recognition-Synthesis Based Non-Parallel Voice Conversion with Adversarial Learning ...
BASE
4
Improving Sequence-to-Sequence Acoustic Modeling by Adding Text-Supervision ...
BASE
5
LID-senones and their statistics for language identification
Jin, Ma; Song, Yan; McLoughlin, Ian Vince. - : Institute of Electrical and Electronics Engineers, 2017
BASE
6
A human neurodevelopmental model for Williams syndrome.
In: Nature, vol 536, iss 7616 (2016)
BASE
7
A human neurodevelopmental model for Williams syndrome.
In: Nature, vol 536, iss 7616 (2016)
BASE
8
Improvements on Deep Bottleneck Network based I-Vector Representation for Spoken Language Identification
BASE
9
Deep Bottleneck Feature for Image Classification
BASE
10
HMM-based unit selection speech synthesis using log likelihood ratios derived from perceptual data
In: Speech Communication. - Amsterdam [et al.]: Elsevier, 63 (2014), 27-37
OLC Linguistik
11
Deep Bottleneck Features for Spoken Language Identification
Jiang, Bing; Song, Yan; Wei, Si. - : Public Library of Science, 2014
BASE
12
Whisper-to-speech conversion using restricted Boltzmann machine arrays
Li, Jing-jie; McLoughlin, Ian Vince; Dai, Li-Rong. - : IET Digital Library, 2014
BASE
13
Deep bottleneck features for spoken language identification
Jiang, Bing; Song, Yan; Wei, Si; Liu, Jun-Hua; McLoughlin, Ian Vince; Dai, Li-Rong. - : Public Library of Science, 2014
Abstract: A key problem in spoken language identification (LID) is to design effective representations which are specific to language information. For example, in recent years, representations based on both phonotactic and acoustic features have proven their effectiveness for LID. Although advances in machine learning have led to significant improvements, LID performance is still lacking, especially for short-duration speech utterances. With the hypothesis that language information is weak and represented only latently in speech, and is largely dependent on the statistical properties of the speech content, existing representations may be insufficient. Furthermore, they may be susceptible to the variations caused by different speakers, specific content of the speech segments, and background noise. To address this, we propose using Deep Bottleneck Features (DBF) for spoken LID, motivated by the success of Deep Neural Networks (DNN) in speech recognition. We show that DBFs can form a low-dimensional compact representation of the original inputs with a powerful descriptive and discriminative capability. To evaluate the effectiveness of this, we design two acoustic models, termed DBF-TV and parallel DBF-TV (PDBF-TV), using a DBF-based i-vector representation for each speech utterance. Results on NIST language recognition evaluation 2009 (LRE09) show significant improvements over state-of-the-art systems. By fusing the output of phonotactic and acoustic approaches, we achieve an EER of 1.08%, 1.89% and 7.01% for 30 s, 10 s and 3 s test utterances, respectively. Furthermore, various DBF configurations have been extensively evaluated, and an optimal system proposed.
Keyword: T Technology
URL: https://doi.org/10.1371/journal.pone.0100795
https://kar.kent.ac.uk/48803/
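Note: a minimal, hedged code sketch of the deep-bottleneck-feature idea described in this abstract is given after the hit list below.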
BASE
14
Minimum Kullback-Leibler divergence parameter generation for HMM-based speech synthesis
In: IEEE Transactions on Audio, Speech, and Language Processing. - New York, NY: IEEE, 20 (2012) 5, 1492-1502
BLLDB
OLC Linguistik
15
Trust region-based optimization for maximum mutual information estimation of HMMs in speech recognition
In: IEEE Transactions on Audio, Speech, and Language Processing. - New York, NY: IEEE, 19 (2011) 8, 2474-2485
BLLDB
OLC Linguistik
16
Intelligence in Williams Syndrome is related to STX1A, which encodes a component of the presynaptic SNARE complex.
In: PloS one, vol 5, iss 4 (2010)
BASE
17
Intelligence in Williams Syndrome Is Related to STX1A, Which Encodes a Component of the Presynaptic SNARE Complex
Gao, Michael C.; Bellugi, Ursula; Dai, Li. - : Public Library of Science, 2010
BASE
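The abstract of hit 13 above describes a DNN trained on frame-level phonetic targets whose narrow bottleneck layer yields compact, discriminative features (DBFs) that are subsequently pooled into per-utterance i-vectors. The sketch below illustrates only the bottleneck extractor under stated assumptions: the framework (PyTorch), the layer sizes, the input dimension and the number of targets are illustrative choices, not values taken from the paper.

```python
# Hedged sketch of a deep bottleneck feature (DBF) extractor: a frame-level
# acoustic DNN with a narrow hidden layer, trained on phonetic targets
# (e.g. senones). All sizes below are illustrative assumptions.
import torch
import torch.nn as nn

class BottleneckDNN(nn.Module):
    def __init__(self, in_dim=440, hidden=1024, bottleneck=40, n_targets=3000):
        super().__init__()
        # Wide hidden layers followed by a narrow "bottleneck" layer whose
        # activations serve as the compact DBF representation.
        self.front = nn.Sequential(
            nn.Linear(in_dim, hidden), nn.Sigmoid(),
            nn.Linear(hidden, hidden), nn.Sigmoid(),
            nn.Linear(hidden, bottleneck),          # bottleneck layer
        )
        # Classification head used only while training on frame labels;
        # it is discarded at feature-extraction time.
        self.head = nn.Sequential(
            nn.Sigmoid(),
            nn.Linear(bottleneck, hidden), nn.Sigmoid(),
            nn.Linear(hidden, n_targets),
        )

    def forward(self, frames):
        bnf = self.front(frames)    # (num_frames, bottleneck) bottleneck features
        logits = self.head(bnf)     # frame-level target scores, used for training
        return bnf, logits

# Toy usage: extract DBFs for 500 spliced acoustic frames of dummy data.
model = BottleneckDNN()
frames = torch.randn(500, 440)
dbf, _ = model(frames)
print(dbf.shape)                    # torch.Size([500, 40])
```

In the paper's pipeline, the per-frame DBFs would then be modelled with a total-variability (i-vector) back end such as DBF-TV; that stage is not sketched here.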

Hits by source type:
Catalogues: 3
Bibliographies: 2
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 14