Catalogue search • Linguistik portal • Fachinformationsdienst (FID)

1	End-to-end ASR to jointly predict transcriptions and linguistic annotations ...
	NAACL 2021; Fujita, Yuya; Omachi, Motoi. - : Underline Science Inc., 2021
	BASE

2	Speech Representation Learning Combining Conformer CPC with Deep Cluster for the ZeroSpeech Challenge 2021 ...
	Maekaku, Takashi; Chang, Xuankai; Fujita, Yuya; Chen, Li-Wei; Watanabe, Shinji; Rudnicky, Alexander. - : arXiv, 2021
	Abstract: We present a system for the Zero Resource Speech Challenge 2021 that combines Contrastive Predictive Coding (CPC) with deep cluster. In deep cluster, we first prepare pseudo-labels by clustering the outputs of a CPC network with k-means. We then train an additional autoregressive model to classify these pseudo-labels in a supervised manner. A phoneme-discriminative representation is obtained by running a second round of clustering on the outputs of the final layer of the autoregressive model. We show that replacing a Transformer layer with a Conformer layer leads to a further gain in a lexical metric. Experimental results show relative improvements of 35% in a phonetic metric, 1.5% in the lexical metric, and 2.3% in a syntactic metric over a CPC-small baseline trained on LibriSpeech 460h data. We achieve top results in this challenge on the syntactic metric. ...
	Keyword: Audio and Speech Processing eess.AS; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
	URL: https://dx.doi.org/10.48550/arxiv.2107.05899 https://arxiv.org/abs/2107.05899
	BASE
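	Note: the two-round clustering described in the abstract of entry 2 can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: random vectors stand in for CPC outputs, a small feed-forward classifier stands in for the autoregressive model, its last hidden layer is taken as the "final layer" output, and all function names (kmeans, train_classifier, deep_cluster) are hypothetical.

```python
import numpy as np


def kmeans(x, k, n_iter=20, seed=0):
    """Plain Lloyd's k-means; returns (centroids, labels)."""
    rng = np.random.default_rng(seed)
    centroids = x[rng.choice(len(x), size=k, replace=False)].copy()
    labels = np.zeros(len(x), dtype=int)
    for _ in range(n_iter):
        # Squared Euclidean distance from every point to every centroid.
        dist = ((x[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = dist.argmin(axis=1)
        for j in range(k):
            members = x[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster is empty
                centroids[j] = members.mean(axis=0)
    return centroids, labels


def train_classifier(x, y, k, hidden=16, lr=0.5, epochs=200, seed=0):
    """One-hidden-layer softmax classifier trained on pseudo-labels.

    Assumption: this stands in for the autoregressive model in the abstract.
    """
    rng = np.random.default_rng(seed)
    w1 = rng.normal(0.0, 0.1, (x.shape[1], hidden))
    w2 = rng.normal(0.0, 0.1, (hidden, k))
    onehot = np.eye(k)[y]
    for _ in range(epochs):
        h = np.tanh(x @ w1)
        logits = h @ w2
        logits -= logits.max(axis=1, keepdims=True)  # numerical stability
        p = np.exp(logits)
        p /= p.sum(axis=1, keepdims=True)
        g_logits = (p - onehot) / len(x)  # gradient of mean cross-entropy
        g_w2 = h.T @ g_logits
        g_h = g_logits @ w2.T
        g_w1 = x.T @ (g_h * (1.0 - h ** 2))
        w1 -= lr * g_w1
        w2 -= lr * g_w2
    return w1, w2


def deep_cluster(features, k):
    """Round 1: k-means pseudo-labels. Round 2: cluster the learned representation."""
    _, pseudo_labels = kmeans(features, k)
    w1, _ = train_classifier(features, pseudo_labels, k)
    learned = np.tanh(features @ w1)  # assumed "final layer" representation
    _, units = kmeans(learned, k)
    return pseudo_labels, units


rng = np.random.default_rng(0)
features = rng.normal(size=(200, 8))  # stand-in for CPC frame-level outputs
pseudo_labels, units = deep_cluster(features, k=4)
print(pseudo_labels[:10], units[:10])
```

	In the described system the features would come from a trained CPC network and the second-round cluster assignments would serve as the discrete units evaluated by the challenge metrics; the sketch only shows the order of the steps.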
