1 | Charformer: Fast Character Transformers via Gradient-based Subword Tokenization ... | BASE
2 | Are Pretrained Convolutions Better than Pretrained Transformers? ... | BASE
3 | StructFormer: Joint Unsupervised Induction of Dependency and Constituency Structure from Masked Language Modeling ... | BASE