61 | Filter-based Discriminative Autoencoders for Children Speech Recognition ...
BASE
62 | Transducer-based language embedding for spoken language identification ...
63 | Detecting Dysfluencies in Stuttering Therapy Using wav2vec 2.0 ...
64 | Multi-sequence Intermediate Conditioning for CTC-based ASR ...
65 | Code Switched and Code Mixed Speech Recognition for Indic languages ...
67 | Multistream neural architectures for cued-speech recognition using a pre-trained visual feature extractor and constrained CTC decoding ...
68 | Applying Feature Underspecified Lexicon Phonological Features in Multilingual Text-to-Speech ...
69 | MAESTRO: Matched Speech Text Representations through Modality Matching ...

Abstract: We present Maestro, a self-supervised training method to unify representations learnt from speech and text modalities. Self-supervised learning from speech signals aims to learn the latent structure inherent in the signal, while self-supervised learning from text attempts to capture lexical information. Learning aligned representations from unpaired speech and text sequences is a challenging task. Previous work either implicitly enforced the representations learnt from these two modalities to be aligned in the latent space, through multitasking and parameter sharing, or explicitly, through conversion between modalities via speech synthesis. While the former suffers from interference between the two modalities, the latter introduces additional complexity. In this paper, we propose Maestro, a novel algorithm to learn unified representations from both these modalities simultaneously that can transfer to diverse downstream tasks such as Automatic Speech Recognition (ASR) and Speech Translation (ST). Maestro learns ... (Submitted to Interspeech 2022)

Keywords: 68T10; Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); FOS: Computer and information sciences; FOS: Electrical engineering, electronic engineering, information engineering; I.2.7; Sound (cs.SD)

URL: https://arxiv.org/abs/2204.03409
DOI: https://dx.doi.org/10.48550/arxiv.2204.03409
72 | Effect and Analysis of Large-scale Language Model Rescoring on Competitive ASR Systems ...
75 | Lombard Effect for Bilingual Speakers in Cantonese and English: importance of spectro-temporal features ...
76 | Cochlear Implant Results in Older Adults with Post-Lingual Deafness: The Role of “Top-Down” Neurocognitive Mechanisms
In: International Journal of Environmental Research and Public Health; Volume 19; Issue 3; Pages: 1343 (2022)
77 | MLLP-VRAIN Spanish ASR Systems for the Albayzín-RTVE 2020 Speech-to-Text Challenge: Extension
In: Applied Sciences; Volume 12; Issue 2; Pages: 804 (2022)
78 | On the Difference of Scoring in Speech in Babble Tests
In: Healthcare; Volume 10; Issue 3; Pages: 458 (2022)
79 | An Empirical Performance Analysis of the Speak Correct Computerized Interface
In: Processes; Volume 10; Issue 3; Pages: 487 (2022)
80 | DeepFry: Identifying Vocal Fry Using Deep Neural Networks ...