1 |
Shapley Idioms: Analysing BERT Sentence Embeddings for General Idiom Token Identification
|
|
|
|
In: Front Artif Intell (2022)
|
|
BASE
|
|
Show details
|
|
2 |
English WordNet Taxonomic Random Walk Pseudo-Corpora
|
|
|
|
In: Conference papers (2020)
|
|
BASE
|
|
Show details
|
|
3 |
Language related issues for machine translation between closely related south Slavic languages
|
|
|
|
BASE
|
|
Show details
|
|
4 |
Synthetic, Yet Natural: Properties of WordNet Random Walk Corpora and the impact of rare words on embedding performance
|
|
|
|
In: Conference papers (2019)
|
|
BASE
|
|
Show details
|
|
5 |
Size Matters: The Impact of Training Size in Taxonomically-Enriched Word Embeddings
|
|
|
|
In: Articles (2019)
|
|
BASE
|
|
Show details
|
|
7 |
Quantitative Fine-Grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian ...
|
|
|
|
BASE
|
|
Show details
|
|
8 |
Quantitative Fine-grained Human Evaluation of Machine Translation Systems: a Case Study on English to Croatian
|
|
|
|
In: Articles (2018)
|
|
BASE
|
|
Show details
|
|
9 |
Is it worth it? Budget-related evaluation metrics for model selection
|
|
|
|
In: Conference papers (2018)
|
|
BASE
|
|
Show details
|
|
10 |
hr500k – A Reference Training Corpus of Croatian.
|
|
|
|
In: Conference papers (2018)
|
|
BASE
|
|
Show details
|
|
15 |
Fine-grained human evaluation of neural versus phrase-based machine translation ...
|
|
|
|
BASE
|
|
Show details
|
|
16 |
Fine-Grained Human Evaluation of Neural Versus Phrase-Based Machine Translation
|
|
|
|
In: Prague Bulletin of Mathematical Linguistics , Vol 108, Iss 1, Pp 121-132 (2017) (2017)
|
|
BASE
|
|
Show details
|
|
19 |
Serbian web corpus srWaC 1.1
|
|
|
|
Abstract:
The Serbian web corpus srWaC was built by crawling the .rs top-level domain in 2014. The corpus was near-deduplicated on paragraph level, normalised via diacritic restoration, morphosyntactically annotated and lemmatised. The corpus is shuffled by paragraphs. Each paragraph contains metadata on the URL, domain and language identification (Serbian vs. Croatian). Version 1.0 of this corpus is described in http://www.aclweb.org/anthology/W14-0405. Version 1.1 contains newer and better linguistic annotations.
|
|
Keyword:
lemmatisation; web corpus
|
|
URL: http://hdl.handle.net/11356/1063
|
|
BASE
|
|
Hide details
|
|
|
|