1. Better Neural Machine Translation by Extracting Linguistic Information from BERT ...
Abstract:
Adding linguistic information (syntax or semantics) to neural machine translation (NMT) has mostly focused on using point estimates from pre-trained models. Directly using the capacity of massive pre-trained contextual word embedding models such as BERT (Devlin et al., 2019) has been marginally useful in NMT because effective fine-tuning is difficult to obtain for NMT without making training brittle and unreliable. We augment NMT by extracting dense fine-tuned vector-based linguistic information from BERT instead of using point estimates. Experimental results show that our method of incorporating linguistic information helps NMT to generalize better in a variety of training contexts and is no more difficult to train than conventional Transformer-based NMT. ...
Keywords:
Computation and Language cs.CL; FOS Computer and information sciences
URL: https://dx.doi.org/10.48550/arxiv.2104.02831 https://arxiv.org/abs/2104.02831
BASE
2. Pointer-based Fusion of Bilingual Lexicons into Neural Machine Translation ...
BASE
3. Top-down Tree Structured Decoding with Syntactic Connections for Neural Machine Translation and Parsing ...
BASE
4. Analysis of Semi-Supervised Learning with the Yarowsky Algorithm ...
BASE
5. Separating Dependency from Constituency in a Tree Rewriting System ...
BASE
6. Coordination in Tree Adjoining Grammars: Formalization and Implementation ...
BASE