Hits 1 – 8 of 8

1	Combining Deep Generative Models and Multi-lingual Pretraining for Semi-supervised Document Classification ...
	Zhu, Yi; Shareghi, Ehsan; Li, Yingzhen. - : arXiv, 2021
	BASE

2	It Is Not As Good As You Think! Evaluating Simultaneous Machine Translation on Interpretation Data ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; ., Gholamreza; Arthur, Philip. - : Underline Science Inc., 2021
	BASE

3	A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters ...
	The Joint Conference of the 59th Annual Meeting of the Association for Computational Linguistics and the 11th International Joint Conference on Natural Language Processing 2021; ., Hinrich; Korhonen, Anna. - : Underline Science Inc., 2021
	BASE

4	Self-Alignment Pretraining for Biomedical Entity Representations
	Liu, Fangyu; Shareghi, Ehsan; Meng, Zaiqiao; Basaldella, Marco; Collier, Nigel. - : Association for Computational Linguistics, 2021. : Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, 2021
	Abstract: Despite the widespread success of self-supervised learning via masked language models (MLM), accurately capturing fine-grained semantic relationships in the biomedical domain remains a challenge. This is of paramount importance for entity-level tasks such as entity linking where the ability to model entity relations (especially synonymy) is pivotal. To address this challenge, we propose SapBERT, a pretraining scheme that self-aligns the representation space of biomedical entities. We design a scalable metric learning framework that can leverage UMLS, a massive collection of biomedical ontologies with 4M+ concepts. In contrast with previous pipeline-based hybrid systems, SapBERT offers an elegant one-model-for-all solution to the problem of medical entity linking (MEL), achieving a new state-of-the-art (SOTA) on six MEL benchmarking datasets. In the scientific domain, we achieve SOTA even without task-specific supervision. With substantial improvement over various domain-specific pretrained MLMs such as BioBERT, SciBERT and PubMedBERT, our pretraining scheme proves to be both effective and robust. ; FL is supported by Grace & Thomas C.H. Chan Cambridge Scholarship. NC and MB would like to acknowledge funding from Health Data Research UK as part of the National Text Analytics project.
	URL: https://doi.org/10.17863/CAM.72095 https://www.repository.cam.ac.uk/handle/1810/324645
	BASE

5	A Closer Look at Few-Shot Crosslingual Transfer: The Choice of Shots Matters ...
	Zhao, Mengjie; Zhu, Yi; Shareghi, Ehsan. - : arXiv, 2020
	BASE

6	Show Some Love to Your n-grams: A Bit of Progress and Stronger n-gram Language Modeling Baselines ...
	Shareghi, Ehsan; Gerz, Daniela; Vulic, Ivan. - : Apollo - University of Cambridge Repository, 2019
	BASE

7	Fast, Small and Exact: Infinite-order Language Modelling with Compressed Suffix Trees ...
	Shareghi, Ehsan; Petri, Matthias; Haffari, Gholamreza. - : arXiv, 2016
	BASE

8	Structured Prediction of Sequences and Trees using Infinite Contexts ...
	Shareghi, Ehsan; Haffari, Gholamreza; Cohn, Trevor. - : arXiv, 2015
	BASE

© 2013 - 2024 Lin|gu|is|tik