2 | KIT Lecture Translator: Multilingual Speech Translation with One-Shot Learning
BASE
3 | Lecture Translator: Speech translation framework for simultaneous lecture translation
4 | Open Source Toolkit for Speech to Text Translation
In: The Prague Bulletin of Mathematical Linguistics, 111 (1), 125–135; ISSN: 0032-6585, 1804-0462 (2022)
7 | Efficient Weight factorization for Multilingual Speech Recognition ...
8 | Efficient weight factorization for multilingual speech recognition
10 | Super-Human Performance in Online Low-latency Recognition of Conversational Speech ...
Abstract: Achieving super-human performance in recognizing human speech has been a goal for several decades, as researchers have worked on increasingly challenging tasks. In the 1990s it was discovered that conversational speech between two humans is considerably more difficult to recognize than read speech, as hesitations, disfluencies, false starts and sloppy articulation complicate acoustic processing and require robust joint handling of acoustic, lexical and language context. Early attempts with statistical models could only reach error rates over 50%, far from human performance (WER of around 5.5%). Neural hybrid models and recent attention-based encoder-decoder models have considerably improved performance, as such contexts can now be learned in an integral fashion. However, processing such contexts requires the entire utterance to be presented and thus introduces unwanted delays before a recognition result can be output. In this paper, we address performance as well as latency. We present results for a system ... : To appear in Interspeech 2021 ...
Keyword: Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2010.03449 https://arxiv.org/abs/2010.03449
12 | Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the "Speaking Rosetta" JSALT 2017 Workshop ...
13 | 19: Grundlagen der Automatischen Spracherkennung (Fundamentals of Automatic Speech Recognition), lecture, winter semester 2017/18, 24.01.2018
14 | Open Source Toolkit for Speech to Text Translation
In: Prague Bulletin of Mathematical Linguistics, Vol 111, Iss 1, Pp 125-135 (2018)
16 | Phonemic and Graphemic Multilingual CTC Based Speech Recognition ...
17 | Comparison of Decoding Strategies for CTC Acoustic Models ...
18 | 13: Grundbegriffe der Informatik (Basic Concepts of Computer Science), lecture, winter semester 2017/18, 01.12.2017
19 | Grundlagen der Automatischen Spracherkennung (Fundamentals of Automatic Speech Recognition), lecture, winter semester 2016/17, 18.01.2017, 18
20 | 03: Grundlagen der Automatischen Spracherkennung (Fundamentals of Automatic Speech Recognition), lecture, winter semester 2017/18, 30.10.2017