explosion/spaCy: v3.3.0: Improved speed, new trainable lemmatizer, and pipelines for Finnish, Korean and Swedish ...
Montani, Ines; Honnibal, Matthew; Van Landeghem, Sofie; Boyd, Adriane; Peters, Henning; McCann, Paul O'Leary; Samsonov, Maxim; Geovedi, Jim; O'Regan, Jim; Altinok, Duygu; Orosz, György; Kristiansen, Søren Lind; Lj Miranda; De Kok, Daniël; , Roman; Explosion Bot; Fiedler, Leander; Howard, Grégory; , Edward; Wannaphong Phatthiyaphaibun; Tamura, Yohei; Bozek, Sam; , Murat; Ryn Daniels; Amery, Mark; Böing, Björn; Vanroy, Bram; Tippa, Pradeep Kumar. Zenodo, 2022
Abstract:
✨ New features and improvements

- Improved speeds for many components (see the speed benchmarks for trained pipelines):
  - Speed up parser and NER by using constant-time head lookups (#10048).
  - Support unnormalized softmax probabilities in spacy.Tagger.v2 to speed up inference for the tagger, morphologizer, senter and trainable lemmatizer (#10197).
  - Speed up parser projectivization functions (#10241).
  - Replace Ragged with faster AlignmentArray in Example for training (#10319).
  - Improve Matcher speed (#10659).
  - Improve serialization speed for empty Doc.spans (#10250).
- NEW: A trainable lemmatizer component that uses edit trees to transform tokens to lemmas. Add it to your config with spacy init config -p trainable_lemmatizer or using the quickstart.
- Language updates:
  - Initial support for Lower Sorbian and Upper Sorbian.
  - New noun chunks for Finnish.
  - Updated noun chunks for French, Italian and Spanish.
  - Additional updates for English, French, Italian, Japanese, Korean, Norwegian, Russian, Slovenian, Spanish, Turkish, Ukrainian ...
URL: https://dx.doi.org/10.5281/zenodo.6504092 https://zenodo.org/record/6504092
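
The abstract describes a trainable lemmatizer that "uses edit trees to transform tokens to lemmas". As a rough illustration of that idea only, the sketch below builds a simplified edit tree from one (form, lemma) pair and applies it to another word. It is not spaCy's implementation: the function names `build_tree` and `apply_tree`, the tuple-based tree layout, and the use of `difflib` to find the longest shared substring are all assumptions made for this example, and no training or tree selection is shown.

```python
from difflib import SequenceMatcher


def build_tree(form, lemma):
    """Build a simplified edit tree that turns `form` into `lemma`.

    Hypothetical layout (not spaCy's data structure):
      ("replace", old, new)                        leaf node
      ("match", prefix_len, suffix_len, left, right)  interior node
    """
    matcher = SequenceMatcher(None, form, lemma, autojunk=False)
    m = matcher.find_longest_match(0, len(form), 0, len(lemma))
    if m.size == 0:
        # Nothing in common: record a plain substring replacement.
        return ("replace", form, lemma)
    # Keep the shared middle part; recurse on what lies left and right of it.
    prefix_len = m.a
    suffix_len = len(form) - (m.a + m.size)
    left = build_tree(form[:m.a], lemma[:m.b])
    right = build_tree(form[m.a + m.size:], lemma[m.b + m.size:])
    return ("match", prefix_len, suffix_len, left, right)


def apply_tree(tree, form):
    """Apply an edit tree to `form`; return None if the tree does not fit."""
    if tree[0] == "replace":
        _, old, new = tree
        return new if form == old else None
    _, prefix_len, suffix_len, left, right = tree
    if prefix_len + suffix_len > len(form):
        return None
    mid_end = len(form) - suffix_len
    left_out = apply_tree(left, form[:prefix_len])
    right_out = apply_tree(right, form[mid_end:])
    if left_out is None or right_out is None:
        return None
    # The middle of the word is copied unchanged.
    return left_out + form[prefix_len:mid_end] + right_out


# A tree built from one pair generalizes to words with the same pattern.
strip_ed = build_tree("walked", "walk")
print(apply_tree(strip_ed, "talked"))
print(apply_tree(strip_ed, "running"))
```

Because the tree stores prefix and suffix lengths rather than the word itself, the tree built from "walked" also applies to "talked", while it is rejected for "running", whose ending does not match the stored replacement.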