1 |
Controlling Utterance Length in NMT-based Word Segmentation with Attention
|
|
|
|
In: International Workshop on Spoken Language Translation, Nov 2019, Hong Kong, China (2019) ; https://hal.archives-ouvertes.fr/hal-02343206
|
|
BASE
|
|
2 |
Empirical Evaluation of Sequence-to-Sequence Models for Word Discovery in Low-resource Settings
|
|
|
|
In: Interspeech 2019, Sep 2019, Graz, Austria (2019) ; https://hal.archives-ouvertes.fr/hal-02193867
|
|
Abstract:
Since Bahdanau et al. [1] first introduced attention for neural machine translation, most sequence-to-sequence models have made use of attention mechanisms [2, 3, 4]. While they produce soft-alignment matrices that could be interpreted as alignments between target and source languages, we lack metrics to quantify their quality, so it is unclear which approach produces the best alignments. This paper presents an empirical evaluation of three of the main sequence-to-sequence models for word discovery from unsegmented phoneme sequences: CNN-, RNN- and Transformer-based. This task consists in aligning word sequences in a source language with phoneme sequences in a target language, and inferring from that alignment a word segmentation on the target side [5]. Evaluating word segmentation quality can be seen as an extrinsic evaluation of the soft-alignment matrices produced during training. Our experiments in a low-resource scenario on the Mboshi and English languages (both aligned to French) show that RNNs surprisingly outperform CNNs and Transformers for this task. Our results are confirmed by an intrinsic evaluation of alignment quality through the use of Average Normalized Entropy (ANE). Lastly, we improve our best word discovery model by using an alignment entropy confidence measure that accumulates ANE over all the occurrences of a given alignment pair in the collection.
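As an illustration of the kind of intrinsic measure the abstract mentions, the sketch below computes an average normalized entropy over the rows of a soft-alignment matrix. The abstract does not give the formula, so the definition used here (Shannon entropy of each row, divided by the log of the row length, then averaged over rows) is an assumption, as are the function names and the toy matrices. It is not the authors' implementation.

```python
import math


def normalized_entropy(row):
    """Entropy of one attention row, scaled to [0, 1].

    Assumption: each row holds non-negative attention weights over the
    source words for one target phoneme; the row is renormalized to sum
    to 1 before the entropy is taken.
    """
    n = len(row)
    if n < 2:
        return 0.0  # a single source word leaves no uncertainty
    total = sum(row)
    probs = [w / total for w in row]
    entropy = -sum(p * math.log(p) for p in probs if p > 0)
    return entropy / math.log(n)  # divide by the maximum possible entropy


def average_normalized_entropy(matrix):
    """Mean of the per-row normalized entropies (hypothetical ANE reading)."""
    return sum(normalized_entropy(row) for row in matrix) / len(matrix)


# Toy soft-alignment matrices: one row per target phoneme,
# one column per source word (values are made up for illustration).
uniform = [[0.25, 0.25, 0.25, 0.25]]
peaked = [[1.0, 0.0, 0.0, 0.0]]
half = [[0.5, 0.5, 0.0, 0.0]]

ane_uniform = average_normalized_entropy(uniform)
ane_peaked = average_normalized_entropy(peaked)
ane_half = average_normalized_entropy(half)
```

Under this reading, a lower value means the attention mass is concentrated on fewer source words, which is what a confident alignment would look like.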
|
|
Keyword:
[INFO.INFO-CL] Computer Science [cs] / Computation and Language [cs.CL]; computational language documentation; low-resource languages; sequence-to-sequence models; soft-alignment matrices; word discovery
|
|
URL: https://hal.archives-ouvertes.fr/hal-02193867/file/IS2019marcely-camera-ready.pdf https://hal.archives-ouvertes.fr/hal-02193867 https://hal.archives-ouvertes.fr/hal-02193867/document
|
|
BASE
|
|
3 |
Word Recognition, Competition, and Activation in a Model of Visually Grounded Speech
|
|
|
|
In: Proceedings of the 23rd Conference on Computational Natural Language Learning (CoNLL), Nov 2019, Hong Kong, China, pp. 339-348, ⟨10.18653/v1/K19-1032⟩ (2019) ; https://hal.archives-ouvertes.fr/hal-02359540
|
|
BASE
|
|
4 |
The Zero Resource Speech Challenge 2019: TTS without T
|
|
|
|
In: Interspeech 2019 - 20th Annual Conference of the International Speech Communication Association, Sep 2019, Graz, Austria (2019) ; https://hal.archives-ouvertes.fr/hal-02274112
|
|
BASE
|
|
5 |
Models of Visually Grounded Speech Signal Pay Attention to Nouns: A Bilingual Experiment on English and Japanese
|
|
|
|
In: International Conference on Acoustics, Speech and Signal Processing (ICASSP), May 2019, Brighton, United Kingdom, pp. 8618-8622, ⟨10.1109/ICASSP.2019.8683069⟩ (2019) ; https://hal.archives-ouvertes.fr/hal-02013984
|
|
BASE
|
|
6 |
A neural approach for inducing multilingual resources and natural language processing tools for low-resource languages
|
|
|
|
In: Natural Language Engineering, Cambridge University Press (CUP), 2019, 25 (01), pp. 43-67, ⟨10.1017/S1351324918000293⟩ (2019) ; ISSN: 1351-3249 ; EISSN: 1469-8110 ; https://hal.archives-ouvertes.fr/hal-01976297
|
|
BASE
|
|
7 |
How Does Language Influence Documentation Workflow? Unsupervised Word Discovery Using Translations in Multiple Languages
|
|
|
|
In: Journées Scientifiques du Groupement de Recherche: Linguistique Informatique, Formelle et de Terrain (LIFT), Nov 2019, Orléans, France (2019) ; https://hal.archives-ouvertes.fr/hal-02895895
|
|
BASE
|
|
8 |
Controlling Utterance Length in NMT-based Word Segmentation with Attention ...
|
|
|
|
BASE
|
|
9 |
Controlling Utterance Length in NMT-based Word Segmentation with Attention ...
|
|
|
|
BASE
|
|
10 |
Controlling Utterance Length in NMT-based Word Segmentation with Attention ...
|
|
|
|
BASE
|
|
11 |
MaSS - Multilingual corpus of Sentence-aligned Spoken utterances ...
|
|
|
|
BASE
|
|
12 |
MaSS - Multilingual corpus of Sentence-aligned Spoken utterances ...
|
|
|
|
BASE
|
|
13 |
How Does Language Influence Documentation Workflow? Unsupervised Word Discovery Using Translations in Multiple Languages ...
|
|
|
|
BASE
|
|
14 |
Word Recognition, Competition, and Activation in a Model of Visually Grounded Speech ...
|
|
|
|
BASE
|
|
15 |
Models of Visually Grounded Speech Signal Pay Attention To Nouns: a Bilingual Experiment on English and Japanese ...
|
|
|
|
BASE