
Search in the Catalogues and Directories

Hits 1 – 20 of 27

1
Alternated Training with Synthetic and Authentic Data for Neural Machine Translation ...
Abstract: While synthetic bilingual corpora have demonstrated their effectiveness in low-resource neural machine translation (NMT), adding more synthetic data often deteriorates translation performance. In this work, we propose alternated training with synthetic and authentic data for NMT. The basic idea is to alternate synthetic and authentic corpora iteratively during training. Compared with previous work, we introduce authentic data as guidance to prevent the training of NMT models from being disturbed by noisy synthetic data. Experiments on Chinese-English and German-English translation tasks show that our approach improves the performance over several strong baselines. We visualize the BLEU landscape to further investigate the role of authentic and synthetic data during alternated training. From the visualization, we find that authentic data helps to direct the NMT model parameters towards points with higher BLEU scores and leads to consistent translation performance improvement. ... (ACL 2021, Short Findings)
Keyword: Computation and Language cs.CL; FOS Computer and information sciences
URL: https://arxiv.org/abs/2106.08582
https://dx.doi.org/10.48550/arxiv.2106.08582
BASE
2
CPM-2: Large-scale Cost-effective Pre-trained Language Models ...
Zhang, Zhengyan; Gu, Yuxian; Han, Xu. - : arXiv, 2021
BASE
3
VISITRON: Visual Semantics-Aligned Interactively Trained Object-Navigator ...
BASE
4
Assessing Multilingual Fairness in Pre-trained Multimodal Representations ...
Wang, Jialu; Liu, Yang; Wang, Xin Eric. - : arXiv, 2021
BASE
5
DialogSum: A Real-Life Scenario Dialogue Summarization Dataset ...
BASE
6
Transfer Learning for Sequence Generation: from Single-source to Multi-source ...
BASE
7
Mask-Align: Self-Supervised Neural Word Alignment ...
BASE
8
Segment, Mask, and Predict: Augmenting Chinese Word Segmentation with Self-Supervision ...
BASE
9
Learning to Selectively Learn for Weakly-supervised Paraphrase Generation ...
BASE
10
SWSR: A Chinese Dataset and Lexicon for Online Sexism Detection ...
Jiang, Aiqi; Yang, Xiaohan; Liu, Yang. - : arXiv, 2021
BASE
11
Analyzing the Limits of Self-Supervision in Handling Bias in Language ...
BASE
12
Statistically significant detection of semantic shifts using contextual word embeddings ...
BASE
13
SWSR: A Chinese Dataset and Lexicon for Online Sexism Detection ...
Jiang, Aiqi; Yang, Xiaohan; Liu, Yang. - : Zenodo, 2021
BASE
14
Statistically Significant Detection of Semantic Shifts using Contextual Word Embeddings ...
BASE
15
Leveraging Word-Formation Knowledge for Chinese Word Sense Disambiguation ...
BASE
16
SWSR: A Chinese Dataset and Lexicon for Online Sexism Detection ...
Jiang, Aiqi; Yang, Xiaohan; Liu, Yang. - : Zenodo, 2021
BASE
17
SWSR: A Chinese Dataset and Lexicon for Sexist Hate Speech Detection ...
Jiang, Aiqi; Yang, Xiaohan; Liu, Yang. - : Zenodo, 2021
BASE
18
SWSR: A Chinese Dataset and Lexicon for Sexist Hate Speech Detection ...
Jiang, Aiqi; Yang, Xiaohan; Liu, Yang. - : Zenodo, 2021
BASE
19
Tapping into non-English-language science for the conservation of global biodiversity. ...
Amano, Tatsuya; Berdejo-Espinola, Violeta; Christie, Alec. - : Apollo - University of Cambridge Repository, 2021
BASE
20
Tapping into non-English-language science for the conservation of global biodiversity. ...
Amano, Tatsuya; Berdejo-Espinola, Violeta; Christie, Alec P. - : Apollo - University of Cambridge Repository, 2021
BASE
