Search in the Catalogues and Directories

Hits 1 – 2 of 2

1
Online Versus Offline NMT Quality: An In-depth Analysis on English–German and German–English
In: COLING 2020 - 28th International Conference on Computational Linguistics, Dec 2020, Virtual, Spain. pp. 5047-5058, ⟨10.18653/v1/2020.coling-main.443⟩ (2020) ; https://hal.archives-ouvertes.fr/hal-02991539
BASE
2
Token-level and sequence-level loss smoothing for RNN language models
In: ACL - 56th Annual Meeting of the Association for Computational Linguistics, Jul 2018, Melbourne, Australia. pp. 2094-2103 (2018) ; https://hal.inria.fr/hal-01790879 ; https://aclanthology.info/papers/P18-1195/p18-1195
Abstract: Despite the effectiveness of recurrent neural network language models, their maximum likelihood estimation suffers from two limitations. First, it treats all sentences that do not match the ground truth as equally poor, ignoring the structure of the output space. Second, it suffers from "exposure bias": during training, tokens are predicted given ground-truth sequences, while at test time prediction is conditioned on generated output sequences. To overcome these limitations, we build upon the recent reward augmented maximum likelihood approach, i.e., sequence-level smoothing, which encourages the model to predict sentences close to the ground truth according to a given performance metric. We extend this approach to token-level loss smoothing, and propose improvements to the sequence-level smoothing approach. Our experiments on two different tasks, image captioning and machine translation, show that token-level and sequence-level loss smoothing are complementary, and significantly improve results.
Keyword: [INFO.INFO-CL]Computer Science [cs]/Computation and Language [cs.CL]; [INFO.INFO-CV]Computer Science [cs]/Computer Vision and Pattern Recognition [cs.CV]; [INFO.INFO-LG]Computer Science [cs]/Machine Learning [cs.LG]
URL: https://hal.inria.fr/hal-01790879
https://hal.inria.fr/hal-01790879/document
https://hal.inria.fr/hal-01790879/file/paper.pdf
BASE