1 |
Cascade or Direct Speech Translation? A Case Study
In: Applied Sciences; Volume 12; Issue 3; Pages: 1097 (2022)
Abstract:
Speech translation has traditionally been tackled with a cascade approach, chaining speech recognition and machine translation components to translate from an audio source in a given language into text or speech in a target language. Leveraging deep learning approaches to natural language processing, recent studies have explored the potential of direct end-to-end neural modelling to perform the speech translation task. Though end-to-end modelling may bring several benefits, such as reduced latency and error propagation, the comparative merits of each approach still deserve detailed evaluation and analysis. In this work, we compared state-of-the-art cascade and direct approaches on the under-resourced Basque–Spanish language pair, which features challenging phenomena such as marked differences in morphology and word order. This case study thus complements other studies in the field, which mostly revolve around the English language. We described and analysed in detail the mintzai-ST corpus, prepared from the sessions of the Basque Parliament, and evaluated the strengths and limitations of cascade and direct speech translation models trained on this corpus, with variants exploiting additional data as well. Our results indicated that, despite significant progress with end-to-end models, which may outperform alternatives in some cases in terms of automated metrics, a cascade approach proved optimal overall in our experiments and manual evaluations.
Keywords:
Basque; cascade speech translation; corpus; direct speech translation; Spanish; speech translation
URL: https://doi.org/10.3390/app12031097
BASE
2 |
Reproduced speech in South American presidential inauguration addresses ; El discurso reproducido en discursos de posesión presidencial sudamericanos
In: Journal of Linguistic Research; Vol. 24 (2021): El discurso político en las redes sociales; 147-175 ; Revista de Investigación Lingüística; Vol. 24 (2021): El discurso político en las redes sociales; 147-175 ; 1989-4554 ; 1139-1146 (2022)
3 |
Functional maps of direct electrical stimulation-induced speech arrest and anomia: a multicentre retrospective study.
In: Brain : a journal of neurology, vol 144, iss 8 (2021)
4 |
Bihemispheric and Cathodal Transcranial Direct Current Stimulation Improves Fluency in Reading but not in Conversation ...
5 |
A Longitudinal Analysis of Spanish Morphosyntactic Performance Based on Spanish-English Bilingual Exposure and Usage ...
7 |
Bihemispheric and Cathodal Transcranial Direct Current Stimulation Improves Reading Fluency in Adults Who Stutter ...
8 |
Sustained neural rhythms reveal endogenous oscillations supporting speech perception. ...
11 |
Detection, Speech Recognition, Loudness, and Preference Outcomes With a Direct Drive Hearing Aid: Effects of Bandwidth
In: Communication Sciences and Disorders Publications (2021)
12 |
L’attentat de la Grande Mosquée de Québec : discours des politiciens rapportés dans la presse
In: Argumentum: Journal of the Seminar of Discursive Logic, Argumentation Theory and Rhetoric, Vol 19, Iss 2, Pp 205-230 (2021)
13 |
La naissance de Marie-Blanche de Grignan. Notes sur la mise en page de la polyphonie sévignéenne
In: ISSN: 2496-5731 ; Acta Litt&Arts [En ligne] ; https://hal.archives-ouvertes.fr/hal-01900042 ; Acta Litt&Arts [En ligne], Grenoble: Université Grenoble Alpes, 2020, Les discours rapportés en contexte épistolaire (XVIe-XVIIIe siècles), http://ouvroir-litt-arts.univ-grenoble-alpes.fr/revues/actalittarts/616 (2020)
14 |
(Be) like en anglais, genre en français : de la prosodie comme commentaire subjectif
In: ISSN: 1638-1718 ; EISSN: 1638-1718 ; E-rea - Revue électronique d’études sur le monde anglophone ; https://hal.archives-ouvertes.fr/hal-03272291 ; E-rea - Revue électronique d’études sur le monde anglophone, Laboratoire d’Études et de Recherche sur le Monde Anglophone, 2020, 1. Le discours rapporté et l’expression de la subjectivité, 23 p. ⟨10.4000/erea.10023⟩ (2020)
15 |
Diamesic Variation in Direct Reported Speech: Representing Orality in Fiction
In: ISSN: 0557-6989 ; Recherches Anglaises et Nord Americaines ; https://hal-amu.archives-ouvertes.fr/hal-02969797 ; Recherches Anglaises et Nord Americaines, Presses Universitaires de Strasbourg, 2020, Internal Variation : A Special Focus on Diamesic Variation, 53, pp.99-114 ; http://pus.unistra.fr/fr/livre/?GCOI=28682100177140 (2020)
16 |
Non-invasive brain stimulation as add-on therapy for subacute post-stroke aphasia: A randomized trial (NORTHSTAR)
In: Research outputs 2014 to 2021 (2020)
19 |
LINGUOPOETICS OF ANOTHER'S DIRECT SPEECH IN UZBEK LANGUAGE ...