1. RobeCzech Base
Abstract:
RobeCzech is a monolingual RoBERTa language representation model trained on Czech data. RoBERTa is a robustly optimized Transformer-based pretraining approach. We show that RobeCzech considerably outperforms equally-sized multilingual and Czech-trained contextualized language representation models, surpasses the current state of the art in all five evaluated NLP tasks, and reaches state-of-the-art results in four of them. The RobeCzech model is released publicly at https://hdl.handle.net/11234/1-3691 and https://huggingface.co/ufal/robeczech-base, both for PyTorch and TensorFlow.
Keywords:
BERT; Czech; Czech language; RoBERTa
URL: http://hdl.handle.net/11234/1-3691
BASE
4. Czech Models (MorfFlex CZ 160310 + PDT 3.0) for MorphoDiTa 160310
Straka, Milan; Straková, Jana. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2016.
5. WordSim353-cs: Evaluation Dataset for Lexical Similarity and Relatedness, based on WordSim353
6. Czech Models (MorfFlex CZ 161115 + PDT 3.0) for MorphoDiTa 161115
Straka, Milan; Straková, Jana. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2016.
8. Czech Models (MorfFlex CZ + PDT) for MorphoDiTa
Straka, Milan; Straková, Jana. Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2014.