1. How2Sign: A large-scale multimodal dataset for continuous American sign language

2. Multilingual Multimodal Pre-training for Zero-Shot Cross-Lingual Transfer of Vision-Language Models

4. Differentiable Allophone Graphs for Language-Universal Speech Recognition

5. Speech technology for unwritten languages
   In: IEEE/ACM Transactions on Audio, Speech and Language Processing, Institute of Electrical and Electronics Engineers, 2020. ISSN 2329-9290; EISSN 2329-9304. DOI: 10.1109/TASLP.2020.2973896. https://hal.inria.fr/hal-02480675

6. AlloVera: a multilingual allophone database
   In: LREC 2020: 12th Language Resources and Evaluation Conference, European Language Resources Association, May 2020, Marseille, France. https://halshs.archives-ouvertes.fr/halshs-02527046 ; https://lrec2020.lrec-conf.org/

8. Towards Zero-shot Learning for Automatic Phonemic Transcription
   Abstract: Automatic phonemic transcription tools are useful for low-resource language documentation. However, due to the lack of training sets, only a tiny fraction of languages have phonemic transcription tools. Fortunately, multilingual acoustic modeling provides a solution given limited audio training data. A more challenging problem is to build phonemic transcribers for languages with zero training data. The difficulty of this task is that phoneme inventories often differ between the training languages and the target language, making it infeasible to recognize unseen phonemes. In this work, we address this problem by adopting the idea of zero-shot learning. Our model is able to recognize unseen phonemes in the target language without any training data. In our model, we decompose phonemes into corresponding articulatory attributes such as vowel and consonant. Instead of predicting phonemes directly, we first predict distributions over articulatory attributes, and then compute phoneme distributions with a customized ... (AAAI 2020)
   Keywords: Audio and Speech Processing (eess.AS); Computation and Language (cs.CL); Sound (cs.SD); FOS: Computer and information sciences; FOS: Electrical engineering, electronic engineering, information engineering
   URL: https://dx.doi.org/10.48550/arxiv.2002.11781 ; https://arxiv.org/abs/2002.11781

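   The abstract above sketches a compositional approach: predict distributions over articulatory attributes per frame, then derive phoneme distributions from them, so unseen phonemes become recognizable through their attributes. A minimal sketch of that composition step follows; the attribute set, the phoneme signatures, and the product-of-probabilities rule are all illustrative assumptions here, since the paper's actual ("customized") composition function is truncated in the abstract.

```python
# Illustrative only: a tiny attribute inventory and hypothetical binary
# phoneme signatures (the paper's real attribute set is much larger).
ATTRIBUTES = ["vowel", "consonant", "voiced", "nasal"]
PHONEME_SIGNATURES = {
    "a": [1, 0, 1, 0],   # vowel, voiced
    "m": [0, 1, 1, 1],   # consonant, voiced, nasal
    "t": [0, 1, 0, 0],   # consonant, voiceless, oral
}

def phoneme_distribution(attr_probs):
    """Compose per-attribute probabilities into a phoneme distribution.

    attr_probs[i] = P(attribute i is present) for one audio frame.
    Each phoneme's score multiplies P(present) for attributes in its
    signature and P(absent) for the rest, then normalizes over phonemes.
    """
    scores = {}
    for phoneme, signature in PHONEME_SIGNATURES.items():
        p = 1.0
        for prob, bit in zip(attr_probs, signature):
            p *= prob if bit else (1.0 - prob)
        scores[phoneme] = p
    total = sum(scores.values())
    return {ph: s / total for ph, s in scores.items()}

# A frame where the attribute model hears a voiced nasal consonant:
dist = phoneme_distribution([0.05, 0.95, 0.9, 0.9])
best = max(dist, key=dist.get)
print(best)  # "m" scores highest under these attribute probabilities
```

   Because the attribute predictors are shared across languages, swapping in a new target language only requires a new signature table, not new training data — which is the zero-shot claim the abstract makes.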
10. Universal Phone Recognition with a Multilingual Allophone System

12. Phoneme Level Language Models for Sequence Based Low Resource ASR

13. Multilingual Speech Recognition with Corpus Relatedness Sampling

14. On Leveraging the Visual Modality for Neural Machine Translation

15. Acoustic-to-Word Models with Conversational Context Information

16. Learned in Speech Recognition: Contextual Acoustic Word Embeddings

17. On Dimensional Linguistic Properties of the Word Embedding Space

18. Linguistic unit discovery from multi-modal inputs in unwritten languages: Summary of the “Speaking Rosetta” JSALT 2017 workshop
   In: ICASSP 2018 - IEEE International Conference on Acoustics, Speech and Signal Processing, Apr 2018, Calgary, Alberta, Canada. https://hal.archives-ouvertes.fr/hal-01709578

19. Late fusion of individual engines for improved recognition of negative emotion in speech - learning vs. democratic vote

20. Sequence-based Multi-lingual Low Resource Speech Recognition