Hits 1 – 20 of 22

1	Repairing Swedish Automatic Speech Recognition ; Korrigering av Automatisk Taligenkänning för Svenska
	Rehn, Karla. - : KTH, Skolan för elektroteknik och datavetenskap (EECS), 2021
	Abstract: The quality of automatic speech recognition has increased dramatically over the last few years, but performance for low- and middle-resource languages such as Swedish is still far from optimal. In this project, KB-BERT, a language model trained on large written corpora, is used to improve the quality of transcriptions for Swedish. The large language model is inserted as a repair module after the automatic speech recognition, aiming to repair the original output into a transcription that more closely resembles the ground truth, using a sequence-to-sequence translation approach. Two automatic speech recognition models are used to transcribe the speech: one is developed in this project using the Kaldi framework, and the other is Microsoft's Azure Speech to text platform. The performance of the translator is evaluated on four datasets, three consisting of read speech and one of spontaneous speech. The spontaneous speech dataset and one of the read datasets include both native and non-native speakers. Performance is measured by three metrics: word error rate, a weighted word error rate, and a semantic similarity measure. The repairs improve the transcriptions of two of the read speech datasets significantly, decreasing the word error rate from 13.69% to 3.05% and from 36.23% to 21.17%. On the spontaneous speech data, the repairs lower the word error rate from 44.38% to 44.06%, and they fail on the last read dataset, increasing the word error rate instead. The lower performance on the latter is likely due to lack of data. ; Automatic speech recognition has improved in recent years, but for small languages such as Swedish the performance is still far from optimal. This project uses KB-BERT, a neural language model trained on large amounts of written text, to improve the quality of transcriptions of Swedish speech. The transcriptions come from two different speech recognition models: one developed in this project with the Kaldi software library, and Microsoft Azure's speech-to-text platform. The transcriptions are repaired using a sequence-to-sequence translation model, and KB-BERT is used to initialize the model. The translation goes from the original transcription produced by one of the speech-to-text models to a transcription that is closer to the correct, actual transcription. The quality of the repairs is evaluated with three different metrics on four different datasets. Three of the datasets are read speech and the fourth is spontaneous; the spontaneous speech and one of the read datasets come both from speakers with Swedish as their native language and from speakers with Swedish as a second language. The three metrics are word error rate, a weighted word error rate, and a measure of semantic similarity. The repairs markedly improve the transcriptions from two of the read datasets, lowering the word error rate from 13.69% to 3.05% and from 36.23% to 21.17%. On the spontaneous speech, the word error rate is lowered from 44.38% to 44.06%. The repairs fail on the fourth dataset, probably because of its small size.
	Keyword: ASR Repair; Automatic speech recognition; Automatisk taligenkänning; Computer and Information Sciences; Data- och informationsvetenskap; Dialogsystem; Dialogue systems; Language models; Reparation av taligenkänning; Språkmodeller
	URL: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-305922
	BASE

2	Developing discourse structure analysis for use on conversations that include people with aphasia
	Gulick, Eleanor
	In: http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1594159643173734 (2020)
	BASE

3	Neural mechanisms for monitoring and halting of spoken word production
	Hansen, Samuel J.; McMahon, Katie L.; de Zubicaray, Greig I. - : MIT Press, 2019
	BASE

4	Conversational trouble and repair in dementia: revision of an existing coding framework
	Sluis, Rachel A.; Campbell, Alana; Atay, Christina. - : Elsevier, 2019
	BASE

5	OTHER-INITIATED SELF-REPAIRS IN STUDENT-STUDENT INTERACTION: THE FREQUENCY OF OCCURRENCE AND MECHANISM
	Denanda Pratiwi Putry; Ahmad Munir; Oikurema Purwati
	In: JEELS (Journal of English Education and Linguistics Studies), Vol 6, Iss 1, Pp 91-110 (2019)
	BASE

6	The role of L2 experience in L1 phonotactic restructuring in sequential bilinguals ...
	Alcorn, Steven Michael (ORCID 0000-0002-3199-1826). - : The University of Texas at Austin, 2018
	BASE

7	The role of L2 experience in L1 phonotactic restructuring in sequential bilinguals
	Alcorn, Steven Michael. - 2018
	BASE

8	The Use of Gesture in Self-Initiated Self-Repair Sequences by Persons with Non-Fluent Aphasia
	Feltner, Eleanor M.
	In: Theses and Dissertations--Linguistics (2016)
	BASE

9	Conversation breakdowns in the audiology clinic: the importance of mutual gaze
	Ekberg, Katie; Hickson, Louise; Grenness, Caitlin. - : John Wiley & Sons, 2016
	BASE

10	Who said what? Sampling conversation repair behavior involving adults with acquired hearing impairment
	Lind, Christopher; Hickson, Louise; Erber, Norman. - : Thieme Medical Publishers, 2010
	BASE

11	Conversational Repair Strategies in Adolescents with Autism Spectrum Disorders
	Philip, Biji A.
	In: http://rave.ohiolink.edu/etdc/view?acc_num=bgsu1225745290 (2008)
	BASE

12	Dental-to-velar perceptual assimilation: A cross-linguistic study of the perception of dental stop+/l/ clusters
	Hallé, Pierre; Best, Catherine
	In: https://halshs.archives-ouvertes.fr/halshs-00129735 (2007)
	BASE

13	Conversation repair and adult cochlear implantation: A qualitative case study
	Lind, C.; Hickson, L. M. H.; Erber, N. - : John Wiley & Sons, 2006
	BASE

14	Exchange of disfluency with age from function words to content words in Spanish speakers who stutter
	Au-Yeung, J; Gomez, IV; Howell, P
	In: J SPEECH LANG HEAR R, 46 (3) 754-765 (2003)
	BASE

15	Interactive Electronic Technical Manuals (IETMs) Annotated Bibliography
	Siegel, Jane; Nawrocki, Elise
	In: DTIC (2002)
	BASE

16	Phonetic Consequences of Speech Disfluency
	Shriberg, Elizabeth E.
	In: DTIC (1999)
	BASE

17	Utterance rate and linguistic properties as determinants of lexical dysfluencies in children who stutter
	Howell, P; Au-Yeung, J; Pilgrim, L
	In: J ACOUST SOC AM, 105 (1) 481-490 (1999)
	BASE

18	Putting People First: Specifying Proper Names in Speech Interfaces
	Matt Marx; Chris Schmandt
	In: http://www.media.mit.edu/speech/papers/1994/marx_UIST94_putting_people_first.ps.gz (1994)
	BASE

19	Detection and Correction of Repairs in Human-Computer Dialog
	Bear, John; Dowding, John; Shriberg, Elizabeth
	In: DTIC (1992)
	BASE

20	Research in Knowledge Representation for Natural Language Communication and Planning Assistance
	Goodman, Bradley A.; Grosz, B.; Haas, A....
	In: DTIC AND NTIS (1988)
	BASE
