1 | Training dataset and dictionary sizes matter in BERT models: the case of Baltic languages

5 | List of single-word male and female occupations in Slovenian

6 | SloBERTa: Slovene monolingual large pretrained masked language model

8 | Evaluation of contextual embeddings on less-resourced languages

10 | Slovenian RoBERTa contextual embeddings model: SloBERTa 1.0

12 | A Resource for Evaluating Graded Word Similarity in Context: CoSimLex

14 | FinEst BERT and CroSloEngual BERT: less is more in multilingual models

17 | SemEval-2020 Task 3: Graded Word Similarity in Context
Santos Armendariz, Carlos; Purver, Matthew; Pollak, Senja. In: Proceedings of the 14th International Workshop on Semantic Evaluation (SemEval 2020). International Committee for Computational Linguistics, 2020. https://www.aclweb.org/anthology/2020.semeval-1.3

18 | ELMo embeddings models for seven languages
Ulčar, Matej. Faculty of Computer and Information Science, University of Ljubljana, 2019.

Abstract:
ELMo language models (https://github.com/allenai/bilm-tf) used to produce contextual word embeddings, trained on large monolingual corpora for seven languages: Slovenian, Croatian, Finnish, Estonian, Latvian, Lithuanian and Swedish. Each language's model was trained for approximately 10 epochs. The training corpora range in size from over 270 M tokens for Latvian to almost 2 B tokens for Croatian. For each language, the roughly 1 million most common tokens were provided as the vocabulary during training. The models can nevertheless infer out-of-vocabulary (OOV) words, since the neural network input is at the character level. Each model is packaged in its own .tar.gz archive consisting of two files: PyTorch weights (.hdf5) and options (.json); both are needed for model inference with the allennlp Python library (https://github.com/allenai/allennlp/blob/master/tutorials/how_to/elmo.md).

Keywords:
contextual embeddings; Croatian language; ELMo; Estonian language; Finnish language; Latvian language; Lithuanian language; Slovenian language; Swedish language; word embeddings

URL: http://hdl.handle.net/11356/1277
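
A minimal inference sketch following the abstract's pointer to the allennlp library; the archive file names and the Slovenian example sentence are illustrative assumptions, not part of the record. Elmo and batch_to_ids are the utilities documented in the allennlp ELMo tutorial linked above.

import torch
from allennlp.modules.elmo import Elmo, batch_to_ids

# The two files unpacked from one model's .tar.gz archive (names assumed here).
options_file = "options.json"
weight_file = "slovenian-elmo-weights.hdf5"

# Request one weighted combination of the biLM layers; disable dropout for inference.
elmo = Elmo(options_file, weight_file, num_output_representations=1, dropout=0.0)

# Input is pre-tokenized; batch_to_ids maps each token to character ids,
# which is why out-of-vocabulary words still receive embeddings.
sentences = [["To", "je", "primer", "stavka", "."]]
character_ids = batch_to_ids(sentences)

with torch.no_grad():
    output = elmo(character_ids)

# One tensor per requested representation, shaped (batch, sequence_length, 1024).
embeddings = output["elmo_representations"][0]
print(embeddings.shape)

Setting num_output_representations higher would yield several independently weighted mixes of the biLM layers, which is only useful when feeding ELMo into multiple downstream tasks at once.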

20 | ELMo embeddings model, Slovenian
Ulčar, Matej. Faculty of Computer and Information Science, University of Ljubljana, 2019.