Hits 1 – 20 of 47

1	First DIHARD Challenge -- System Submissions and Scores ...
	Ryant, Neville; Church, Kenneth; Cieri, Christopher. - : Zenodo, 2021
	BASE

2	First DIHARD Challenge -- System Submissions and Scores ...
	Ryant, Neville; Church, Kenneth; Cieri, Christopher. - : Zenodo, 2021
	BASE

3	Adaptations in Speech Processing
	Xu, Jue. - : Humboldt-Universität zu Berlin, 2021
	BASE

4	Learning speech embeddings for speaker adaptation and speech understanding
	Sari, Leda. - 2021
	Abstract: In recent years, deep neural network models gained popularity as a modeling approach for many speech processing tasks including automatic speech recognition (ASR) and spoken language understanding (SLU). In this dissertation, there are two main goals. The first goal is to propose modeling approaches in order to learn speaker embeddings for speaker adaptation or to learn semantic speech embeddings. The second goal is to introduce training objectives that achieve fairness for the ASR and SLU problems. In the case of speaker adaptation, we introduce an auxiliary network to an ASR model and learn to simultaneously detect speaker changes and adapt to the speaker in an unsupervised way. We show that this joint model leads to lower error rates as compared to a two-step approach where the signal is segmented into single speaker regions and then fed into an adaptation model. We then reformulate the speaker adaptation problem from a counterfactual fairness point-of-view and introduce objective functions to match the ASR performance of the individuals in the dataset to that of their counterfactual counterparts. We show that we can achieve lower error rate in an ASR system while reducing the performance disparity between protected groups. In the second half of the dissertation, we focus on SLU and tackle two problems associated with SLU datasets. The first SLU problem is the lack of large speech corpora. To handle this issue, we propose to use available non-parallel text data so that we can leverage the information in text to guide learning of the speech embeddings. We show that this technique increases the intent classification accuracy as compared to a speech-only system. The second SLU problem is the label imbalance problem in the datasets, which is also related to fairness since a model trained on skewed data usually leads to biased results. To achieve fair SLU, we propose to maximize the F-measure instead of conventional cross-entropy minimization and show that it is possible to increase the number of classes with nonzero recall. In the last two chapters, we provide additional discussions on the impact of these projects from both technical and social perspectives, propose directions for future research and summarize the findings.
	Keyword: automatic speech recognition; fairness in machine learning; Neural networks; speaker adaptation; spoken language understanding
	URL: http://hdl.handle.net/2142/110438
	BASE

5	Towards unsupervised learning of speech features in the wild
	Rivière, Morgane; Dupoux, Emmanuel
	In: SLT 2020 : IEEE Spoken Language Technology Workshop ; https://hal.archives-ouvertes.fr/hal-03070411 ; SLT 2020 : IEEE Spoken Language Technology Workshop, Dec 2020, Shenzhen / Virtual, China (2020)
	BASE

6	Achieving Multi-Accent ASR via Unsupervised Acoustic Model Adaptation
	Turan, Mehmet Ali Tuğtekin; Vincent, Emmanuel; Jouvet, Denis
	In: INTERSPEECH 2020 ; https://hal.inria.fr/hal-02907929 ; INTERSPEECH 2020, Oct 2020, Shanghai, China (2020)
	BASE

7	Preschoolers' Attention to Emotional Prosody as a Function of Speaker Conventionality ...
	Wieczorek, Karolina Marta. - : Arts, 2020
	BASE

8	Learning to adapt: meta-learning approaches for speaker adaptation
	Klejch, Ondrej. - : The University of Edinburgh, 2020
	BASE

9	Introducing Phonetic Information to Speaker Embedding for Speaker Verification
	Liu, Yi; He, Liang; Johnson, Michael T.
	In: Electrical and Computer Engineering Faculty Publications (2019)
	BASE

10	Speaker-Adapted Confidence Measures for ASR using Deep Bidirectional Recurrent Neural Networks
	Del Agua Teba, Miguel Angel; Giménez Pastor, Adrián; Sanchis Navarro, José Alberto. - : Institute of Electrical and Electronics Engineers, 2018
	BASE

11	Extending the Cascaded Gaussian Mixture Regression Framework for Cross-Speaker Acoustic-Articulatory Mapping
	Girin, Laurent; Hueber, Thomas; Alameda-Pineda, Xavier
	In: ISSN: 2329-9290 ; EISSN: 2329-9304 ; IEEE/ACM Transactions on Audio, Speech and Language Processing ; https://hal.archives-ouvertes.fr/hal-01485540 ; IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2017, 25 (3), pp.662-673. ⟨10.1109/TASLP.2017.2651398⟩ (2017)
	BASE

12	Articulatory representations to address acoustic variability in speech ...
	Sivaraman, Ganesh. - : Digital Repository at the University of Maryland, 2017
	BASE

13	Articulatory representations to address acoustic variability in speech
	Sivaraman, Ganesh. - 2017
	BASE

14	Adaptation au locuteur pour la séparation de la parole par NMF
	Doras, Guillaume
	In: https://hal.sorbonne-universite.fr/hal-01482183 ; [Stage] STMS - Sciences et Technologies de la Musique et du Son UMR 9912 IRCAM-CNRS-UPMC. 2016 (2016)
	BASE

15	Iterative PLDA Adaptation for Speaker Diarization
	Le Lan, Gaël; Charlet, Delphine; Larcher, Anthony...
	In: Interspeech 2016 ; https://hal.archives-ouvertes.fr/hal-01433172 ; Interspeech 2016, Sep 2016, San Francisco, United States. pp.2175 - 2179, ⟨10.21437/Interspeech.2016-572⟩ (2016)
	BASE

16	Speaker-dependent Multipitch Tracking Using Deep Neural Networks
	Liu, Yuzhou; Wang, DeLiang. - 2015
	BASE

17	Phonetic reduction in spontaneous speech by children aged 9-14 years
	Tuomainen, O; Lee, C; Granlund, S...
	In: Presented at: 18th International Congress of Phonetic Sciences, Glasgow, UK. (2015)
	BASE

18	The Acquisition of Vowel Normalization during Early Infancy: Theory and Computational Framework
	Plummer, Andrew R
	In: http://rave.ohiolink.edu/etdc/view?acc_num=osu1388689249 (2014)
	BASE

19	'All the better for not seeing you': effects of communicative context on the speech of an individual with acquired communication difficulties.
	Bruce, C; Braidwood, U; Newton, C
	In: J Commun Disord, 46 (5-6) 475 - 483. (2013)
	BASE

20	Contributions to Adaptation on Automatic Speech Recognition and Multilingual Handwritten Text Recognition
	Del Agua Teba, Miguel Angel. - : Universitat Politècnica de València, 2013
	BASE
