1. Automated speech tools for helping communities process restricted-access corpora for language revival efforts ...
3. Experiment artefacts: DTW search and evaluation (main and pilot experiments) ...
5. Leveraging pre-trained representations to improve access to untranscribed speech from endangered languages ...
San, Nay; Bartelds, Martijn; Browne, Mitchell; Clifford, Lily; Gibson, Fiona; Mansfield, John; Nash, David; Simpson, Jane; Turpin, Myfany; Vollmer, Maria; Wilmoth, Sasha; Jurafsky, Dan. - : arXiv, 2021
Abstract:
Pre-trained speech representations like wav2vec 2.0 are a powerful tool for automatic speech recognition (ASR). Yet many endangered languages lack sufficient data for pre-training such models, or are predominantly oral vernaculars without a standardised writing system, precluding fine-tuning. Query-by-example spoken term detection (QbE-STD) offers an alternative for iteratively indexing untranscribed speech corpora by locating spoken query terms. Using data from 7 Australian Aboriginal languages and a regional variety of Dutch, all of which are endangered or vulnerable, we show that QbE-STD can be improved by leveraging representations developed for ASR (wav2vec 2.0: the English monolingual model and XLSR53 multilingual model). Surprisingly, the English model outperformed the multilingual model on 4 Australian language datasets, raising questions around how to optimally leverage self-supervised speech representations for QbE-STD. Nevertheless, we find that wav2vec 2.0 representations (either English or ... : Accepted at ASRU 2021 ...
Keyword:
Audio and Speech Processing eess.AS; Computation and Language cs.CL; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
URL: https://arxiv.org/abs/2103.14583 https://dx.doi.org/10.48550/arxiv.2103.14583
6. Adapting Monolingual Models: Data can be Scarce when Language Similarity is High ...
7. Adapting Monolingual Models: Data can be Scarce when Language Similarity is High ...
9. A New Acoustic-Based Pronunciation Distance Measure
In: Front Artif Intell (2020)