
Search in the Catalogues and Directories

Hits 1 – 13 of 13

1. Pathologies of Pre-trained Language Models in Few-shot Fine-tuning ... (BASE)
2. MetaXL: Meta Representation Transformation for Low-resource Cross-lingual Learning ... (BASE)
3. A Dataset and Baselines for Multilingual Reply Suggestion ... (BASE)
4. A Conditional Generative Matching Model for Multi-lingual Reply Suggestion ... (BASE)
5. XtremeDistilTransformers: Task Transfer for Task-agnostic Distillation ... (BASE)
6. Say `YES' to Positivity: Detecting Toxic Language in Workplace Communications ... (BASE)
7. A Conditional Generative Matching Model for Multi-lingual Reply Suggestion ... (BASE)
8. Gender Bias in Multilingual Embeddings and Cross-Lingual Transfer ... (BASE)
9. XtremeDistil: Multi-stage Distillation for Massive Multilingual Models ... (BASE)
Abstract: Deep and large pre-trained language models are the state of the art for various natural language processing tasks. However, the huge size of these models can be a deterrent to using them in practice. Some recent and concurrent works use knowledge distillation to compress these huge models into shallow ones. In this work we study knowledge distillation with a focus on multilingual Named Entity Recognition (NER). In particular, we study several distillation strategies and propose a stage-wise optimization scheme that leverages teacher internal representations, is agnostic of the teacher architecture, and outperforms strategies employed in prior works. Additionally, we investigate the role of factors such as the amount of unlabeled data, annotation resources, model architecture, and inference latency. We show that our approach leads to massive compression of MBERT-like teacher models, up to 35x in parameters and 51x in latency for batch inference, while retaining 95% ... (To appear in ACL 2020.) A rough code sketch of the stage-wise distillation idea appears after the hit list.
Keywords: Computation and Language (cs.CL); FOS: Computer and information sciences; Machine Learning (cs.LG)
URL: https://arxiv.org/abs/2004.05686
DOI: https://dx.doi.org/10.48550/arxiv.2004.05686
10. Smart To-Do: Automatic Generation of To-Do Items from Emails ... (BASE)
11. Distilling BERT into Simple Neural Networks with Unlabeled Transfer Data ... (BASE)
12. Multi-Source Cross-Lingual Model Transfer: Learning What to Share ... (BASE)
13. Identifying Roles in Social Networks using Linguistic Analysis. (BASE)
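
The XtremeDistil abstract (hit 9) describes a stage-wise distillation scheme that first matches teacher internal representations and then teacher label distributions. The following is a minimal sketch of that general idea only, assuming PyTorch; the student architecture, loss choices, and every hyperparameter below are illustrative assumptions, not the paper's actual implementation.

    # Illustrative stage-wise knowledge distillation sketch (not XtremeDistil itself).
    import torch
    import torch.nn as nn
    import torch.nn.functional as F

    class BiLSTMStudent(nn.Module):
        """Shallow student whose hidden states are projected into the teacher's space."""
        def __init__(self, vocab_size=30000, hidden=300, teacher_hidden=768, num_labels=9):
            super().__init__()
            self.embed = nn.Embedding(vocab_size, hidden)
            self.encoder = nn.LSTM(hidden, hidden, batch_first=True, bidirectional=True)
            self.project = nn.Linear(2 * hidden, teacher_hidden)  # align dims with teacher
            self.classifier = nn.Linear(teacher_hidden, num_labels)

        def forward(self, token_ids):
            x, _ = self.encoder(self.embed(token_ids))
            rep = self.project(x)             # per-token representation, teacher-sized
            return rep, self.classifier(rep)  # (representations, logits)

    def representation_loss(student_rep, teacher_rep):
        # Stage 1: regress student states onto teacher internal representations.
        # Only the teacher's outputs are consumed, so this step does not depend
        # on the teacher's architecture.
        return F.mse_loss(student_rep, teacher_rep)

    def logit_loss(student_logits, teacher_logits, temperature=2.0):
        # Stage 2: match temperature-softened teacher label distributions.
        p_teacher = F.softmax(teacher_logits / temperature, dim=-1)
        log_p_student = F.log_softmax(student_logits / temperature, dim=-1)
        return F.kl_div(log_p_student, p_teacher, reduction="batchmean") * temperature ** 2

    # Schematic usage: train stage 1 on unlabeled transfer data, then switch to stage 2.
    student = BiLSTMStudent()
    tokens = torch.randint(0, 30000, (4, 16))  # fake batch: 4 sequences of 16 tokens
    teacher_rep = torch.randn(4, 16, 768)      # stand-in for teacher hidden states
    teacher_logits = torch.randn(4, 16, 9)     # stand-in for teacher NER logits
    rep, logits = student(tokens)
    loss = representation_loss(rep, teacher_rep)        # stage 1
    # later: loss = logit_loss(logits, teacher_logits)  # stage 2

Optimizing the stages one after another, rather than summing both losses from the start, is what makes the scheme "stage-wise": the student first inherits the teacher's representation space, then learns to predict labels within it.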
