DE eng

Search in the Catalogues and Directories

Hits 1 – 4 of 4

1
On the Language-specificity of Multilingual BERT and the Impact of Fine-tuning ...
BASE
Show details
2
Data Augmentation for Speech Recognition in Maltese: A Low-Resource Perspective ...
Abstract: Developing speech technologies is a challenge for low-resource languages for which both annotated and raw speech data is sparse. Maltese is one such language. Recent years have seen an increased interest in the computational processing of Maltese, including speech technologies, but resources for the latter remain sparse. In this paper, we consider data augmentation techniques for improving speech recognition for such languages, focusing on Maltese as a test case. We consider three different types of data augmentation: unsupervised training, multilingual training and the use of synthesized speech as training data. The goal is to determine which of these techniques, or combination of them, is the most effective to improve speech recognition for languages where the starting point is a small corpus of approximately 7 hours of transcribed speech. Our results show that combining the three data augmentation techniques studied here lead us to an absolute WER improvement of 15% without the use of a language model. ... : 30 pages; 9 tables ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2111.07793
https://dx.doi.org/10.48550/arxiv.2111.07793
BASE
Hide details
3
Automatic Removal of Identifying Information in Official EU Languages for Public Administrations: The MAPA Project
In: Legal Knowledge and Information Systems ; Frontiers in Artificial Intelligence and Applications ; International Conference on Legal Knowledge and Information Systems ; https://hal.archives-ouvertes.fr/hal-03058311 ; International Conference on Legal Knowledge and Information Systems, Dec 2020, Brno, Prague, Czech Republic. pp.223-226, ⟨10.3233/FAIA200869⟩ ; http://ebooks.iospress.nl/volume/legal-knowledge-and-information-systems-jurix-2020-the-thirty-third-annual-conference-brno-czech-republic-december-911-2020 (2020)
BASE
Show details
4
Creating Expert Knowledge by Relying on Language Learners: a Generic Approach for Mass-Producing Language Resources by Combining Implicit Crowdsourcing and Language Learning
In: LREC 2020 - Language Resources and Evaluation Conference ; https://hal.inria.fr/hal-02879883 ; LREC 2020 - Language Resources and Evaluation Conference, May 2020, Marseille, France (2020)
BASE
Show details

Catalogues
0
0
0
0
0
0
0
Bibliographies
0
0
0
0
0
0
0
0
0
Linked Open Data catalogues
0
Online resources
0
0
0
0
Open access documents
4
0
0
0
0
© 2013 - 2024 Lin|gu|is|tik | Imprint | Privacy Policy | Datenschutzeinstellungen ändern