1 | How to train your self-supervised NLP model: Investigating pre-training objectives, data, and scale
4 | DESCGEN: A Distantly Supervised Dataset for Generating Entity Descriptions ...
5 | Prompting Contrastive Explanations for Commonsense Reasoning Tasks ...
6 | Detecting Hallucinated Content in Conditional Neural Sequence Generation ...
7 | FaVIQ: FAct Verification from Information-seeking Questions ...
8 | Quantifying Adaptability in Pre-trained Language Models with 500 Tasks ...
9 | Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment ...
10 | Do Syntactic Probes Probe Syntax? Experiments with Jabberwocky Probing
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
11 | What About the Precedent: An Information-Theoretic Analysis of Common Law
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
12 | Finding Concept-specific Biases in Form–Meaning Associations
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
13 | A Non-Linear Structural Probe
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
14 | How (Non-)Optimal is the Lexicon?
In: Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies (2021)
15 | Bilingual Lexicon Induction via Unsupervised Bitext Construction and Word Alignment ...
16 | Backtranslation feedback improves user confidence in MT, not quality
17 | Nearest Neighbor Machine Translation ...

Abstract:
We introduce $k$-nearest-neighbor machine translation ($k$NN-MT), which predicts tokens with a nearest neighbor classifier over a large datastore of cached examples, using representations from a neural translation model for similarity search. This approach requires no additional training and scales to give the decoder direct access to billions of examples at test time, resulting in a highly expressive model that consistently improves performance across many settings. Simply adding nearest neighbor search improves a state-of-the-art German-English translation model by 1.5 BLEU. $k$NN-MT allows a single model to be adapted to diverse domains by using a domain-specific datastore, improving results by an average of 9.2 BLEU over zero-shot transfer, and achieving new state-of-the-art results -- without training on these domains. A massively multilingual model can also be specialized for particular language pairs, with improvements of 3 BLEU for translating from English into German and Chinese. Qualitatively, ...

In: ICLR 2021

Keywords: Computation and Language (cs.CL); FOS: Computer and information sciences

URL: https://arxiv.org/abs/2010.00710 https://dx.doi.org/10.48550/arxiv.2010.00710
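The abstract above describes $k$NN-MT's core prediction rule: at each decoding step, retrieve the K cached decoder states nearest the current one, turn their stored next-tokens into a distribution, and interpolate it with the base model's softmax. A minimal sketch of that interpolation step, assuming a toy random datastore, a uniform stand-in base distribution, and illustrative parameter names (none of this is the authors' implementation):

```python
import math
import random

# Toy datastore (illustrative, not the paper's code): each entry pairs a
# cached decoder hidden state (key) with the token that followed it (value).
random.seed(0)
D, V, N, K = 8, 5, 100, 4  # hidden dim, vocab size, datastore size, neighbors
keys = [[random.gauss(0, 1) for _ in range(D)] for _ in range(N)]
values = [random.randrange(V) for _ in range(N)]

def knn_mt_probs(query, p_model, temperature=1.0, lam=0.5):
    """Interpolate the base MT distribution with a kNN distribution built
    from the K nearest cached states -- no additional training required."""
    dists = [math.dist(query, k) for k in keys]        # L2 distance to every key
    nn = sorted(range(N), key=lambda i: dists[i])[:K]  # K nearest neighbors
    weights = [math.exp(-dists[i] / temperature) for i in nn]
    z = sum(weights)
    p_knn = [0.0] * V
    for i, w in zip(nn, weights):
        p_knn[values[i]] += w / z                      # aggregate votes per token
    # Final distribution: lambda * p_kNN + (1 - lambda) * p_model
    return [lam * pk + (1 - lam) * pm for pk, pm in zip(p_knn, p_model)]

query = [random.gauss(0, 1) for _ in range(D)]  # current decoder hidden state
p_model = [1.0 / V] * V                         # stand-in for the NMT softmax
p = knn_mt_probs(query, p_model)
```

In practice the datastore holds billions of entries, so the brute-force distance scan here would be replaced by an approximate nearest-neighbor index; the interpolation itself is unchanged.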
19 | Multilingual Denoising Pre-training for Neural Machine Translation ...