1. Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation ...

2. Self-Training Sampling with Monolingual Data Uncertainty for Neural Machine Translation ...

3. On the Copying Behaviors of Pre-Training for Neural Machine Translation ...

4. Multi-Task Learning with Shared Encoder for Non-Autoregressive Machine Translation ...

5. On the Inference Calibration of Neural Machine Translation ...

6. EmpDG: Multi-resolution Interactive Empathetic Dialogue Generation ...

8. Assessing the Bilingual Knowledge Learned by Neural Machine Translation Models ...

9. Understanding and Improving Lexical Choice in Non-Autoregressive Translation ...
Abstract: Knowledge distillation (KD) is essential for training non-autoregressive translation (NAT) models by reducing the complexity of the raw data with an autoregressive teacher model. In this study, we empirically show that as a side effect of this training, the lexical choice errors on low-frequency words are propagated to the NAT model from the teacher model. To alleviate this problem, we propose to expose the raw data to NAT models to restore the useful information of low-frequency words, which are missed in the distilled data. To this end, we introduce an extra Kullback-Leibler divergence term derived by comparing the lexical choice of NAT model and that embedded in the raw data. Experimental results across language pairs and model architectures demonstrate the effectiveness and universality of the proposed approach. Extensive analyses confirm our claim that our approach improves performance by reducing the lexical choice errors on low-frequency words. Encouragingly, our approach pushes the SOTA NAT ... : ICLR 2021 ...
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2012.14583 https://arxiv.org/abs/2012.14583

10. Information Aggregation for Multi-Head Attention with Routing-by-Agreement ...

11. Neuron Interaction Based Representation Composition for Neural Machine Translation ...

12. Multi-Granularity Self-Attention for Neural Machine Translation ...

13. Towards Understanding Neural Machine Translation with Word Importance ...

14. Towards Better Modeling Hierarchical Structure for Self-Attention with Ordered Neurons ...

15. Translating pro-drop languages with reconstruction models
In: Wang, Longyue orcid:0000-0002-9062-6183, Tu, Zhaopeng, Shi, Shuming, Zhang, Tong, Graham, Yvette and Liu, Qun orcid:0000-0002-7000-1792 (2018) Translating pro-drop languages with reconstruction models. In: Thirty-Second AAAI Conference on Artificial Intelligence (AAAI-18), 2-7 Feb 2018, New Orleans, LA, USA. ISBN 978-1-57735-800-8 (2018)

16. Translating pro-drop languages with reconstruction models
In: Wang, Longyue orcid:0000-0002-9062-6183, Tu, Zhaopeng, Shi, Shuming, Zhang, Tong, Graham, Yvette and Liu, Qun orcid:0000-0002-7000-1792 (2018) Translating pro-drop languages with reconstruction models. In: 32nd AAAI Conference on Artificial Intelligence (AAAI 2018), 2-7 Feb 2018, New Orleans, LA, USA. ISBN 978-1-57735-800-8 (2018)

17. Translating Pro-Drop Languages with Reconstruction Models ...

18. Exploiting Deep Representations for Neural Machine Translation ...

20. Exploiting cross-sentence context for neural machine translation
In: Wang, Longyue orcid:0000-0002-9062-6183, Tu, Zhaopeng, Way, Andy orcid:0000-0001-5736-5930 and Liu, Qun orcid:0000-0002-7000-1792 (2017) Exploiting cross-sentence context for neural machine translation. In: 2017 Conference on Empirical Methods in Natural Language Processing, 7-8 Sept 2017, Copenhagen, Denmark. (2017)