Catalogue search • Linguistik portal • Fachinformationsdienst (FID)

1	Improving Data Selection for Low Resource STT and KWS
	Fraga-Silva, Thiago; Laurent, Antoine; Gauvain, Jean-Luc; Lamel, Lori; Le, Viet-Bac; Messaoudi, Abdel. - 2016
	Abstract: This paper extends recent research on training data selection for speech transcription and keyword spotting system development. Selection techniques were explored in the context of the IARPA-Babel Active Learning (AL) task for 6 languages. Different selection criteria were considered with the goal of improving over a system built using a pre-defined 3-hour training data set. Four variants of the entropy-based criterion were explored: words, triphones, phones as well as the use of HMM-states previously introduced in [4]. The influence of the number of HMM-states was assessed as well as whether automatic or manual reference transcripts were used. The combination of selection criteria was investigated, and a novel multi-stage selection method proposed. This method was also assessed using larger data sets than were permitted in the Babel AL task. Results are reported for the 6 languages. The multi-stage selection was also applied to the surprise language (Swahili) in the NIST OpenKWS 2015 evaluation. 2015 IEEE Workshop on Automatic Speech Recognition and Understanding (ASRU), 13 Dec 2015, 17 Dec 2015
	Keyword: acoustics; data selection; decoding; entropy; Hidden Markov models; IARPA Collection; keyword spotting; low-resource languages; speech; speech recognition; training; training data
	URL: http://www.dtic.mil/docs/citations/AD1038536 http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=AD1038536
	BASE

2	Machine Translation Based Data Augmentation for Cantonese Keyword Spotting (Author's Manuscript)
	Huang, Guangpu; Gorin, Arseniy; Gauvain, Jean-Luc. - 2016
	BASE

3	Investigating Techniques for Low Resource Conversational Speech Recognition
	Laurent, Antoine; Fraga-Silva, Thiago; Lamel, Lori. - 2016
	BASE
