2. huggingface/transformers: v4.4.0: S2T, M2M100, I-BERT, mBART-50, DeBERTa-v2, XLSR-Wav2Vec2 ...
4. Transformers: State-of-the-Art Natural Language Processing ...
5. Transformers: State-of-the-Art Natural Language Processing ...
6. huggingface/transformers: ProphetNet, Blenderbot, SqueezeBERT, DeBERTa ...
7. huggingface/transformers: Trainer, TFTrainer, Multilingual BART, Encoder-decoder improvements, Generation Pipeline ...
8. huggingface/pytorch-transformers: DistilBERT, GPT-2 Large, XLM multilingual models, bug fixes ...
Wolf, Thomas; Debut, Lysandre; Sanh, Victor; Denis; Matt; Châtel, Grégory; Chaumond, Julien; Rault, Tim; Voss, Catalin; Wang, Fei; Pietsch, Malte; Fiocco, Davide; Dhanajitb; Schweter, Stefan; Jha, Ananya Harsh; Yzy5630; Wang, Yongbo; Wu, Shijie; Subies, Guillem García; Wang, Weixin; Du, Zeyao; Liu, Chi-Liang; Korolev, Nikolay; Grus, Joel; Abbott, Jade; Pollack, David; Svejda, Matej; Clement; Ailing; Rao, Abhishek. Zenodo, 2019.
Abstract:
New model architecture: DistilBERT. Adds Hugging Face's new transformer architecture, DistilBERT, described in "Smaller, faster, cheaper, lighter: Introducing DistilBERT, a distilled version of BERT" by Victor Sanh, Lysandre Debut and Thomas Wolf. This new architecture comes with two pretrained checkpoints: distilbert-base-uncased (the base DistilBERT model) and distilbert-base-uncased-distilled-squad (a DistilBERT model fine-tuned with distillation on SQuAD).
An awaited new pretrained checkpoint: GPT-2 large (774M parameters). The third OpenAI GPT-2 checkpoint (GPT-2 large) is available in the library under the shortcut name gpt2-large: 774M parameters, 36 layers, and 20 heads.
New XLM multilingual pretrained checkpoints in 17 and 100 languages. Two new XLM models, covering 17 and 100 languages, obtain better performance than multilingual BERT on the XNLI cross-lingual classification task.
New dependency: sacremoses. Support for XLM is improved by carefully reproducing the original tokenization ...
URL: https://dx.doi.org/10.5281/zenodo.3385998
https://zenodo.org/record/3385998
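As a minimal sketch of loading the checkpoints this release introduced (written against the modern transformers API, since the release described here shipped under the pytorch-transformers package name, so imports and tokenizer call style may differ on the original version; the XLM shortcut names below are the published Hub identifiers for the 17- and 100-language checkpoints):

```python
# Minimal sketch: loading the checkpoints introduced in this release.
# NOTE: uses the modern `transformers` package; the release described here
# shipped as `pytorch-transformers`, so the package name differs there.
import torch
from transformers import (
    DistilBertModel,
    DistilBertTokenizer,
    GPT2LMHeadModel,
    GPT2Tokenizer,
    XLMModel,
    XLMTokenizer,
)

# DistilBERT base checkpoint; a SQuAD-distilled variant is also available
# as "distilbert-base-uncased-distilled-squad".
tok = DistilBertTokenizer.from_pretrained("distilbert-base-uncased")
model = DistilBertModel.from_pretrained("distilbert-base-uncased")
inputs = tok("DistilBERT is smaller, faster, cheaper, lighter.", return_tensors="pt")
with torch.no_grad():
    hidden = model(**inputs).last_hidden_state  # shape: (1, seq_len, 768)

# GPT-2 large: the 774M-parameter checkpoint under the shortcut name "gpt2-large".
gpt2_tok = GPT2Tokenizer.from_pretrained("gpt2-large")
gpt2 = GPT2LMHeadModel.from_pretrained("gpt2-large")
prompt = gpt2_tok("The third GPT-2 checkpoint", return_tensors="pt")
out = gpt2.generate(**prompt, max_new_tokens=20)
print(gpt2_tok.decode(out[0], skip_special_tokens=True))

# XLM multilingual checkpoints (17 and 100 languages); their tokenization
# relies on the new sacremoses dependency mentioned in the release notes.
xlm_tok = XLMTokenizer.from_pretrained("xlm-mlm-17-1280")
xlm = XLMModel.from_pretrained("xlm-mlm-17-1280")
```

The same from_pretrained pattern applies to the 100-language checkpoint (xlm-mlm-100-1280); only the shortcut name changes.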