Hits 1 – 16 of 16

1	Evaluating Multiway Multilingual NMT in the Turkic Languages ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Mirzakhalov, Jamshidbek. - : Underline Science Inc., 2021
	BASE

2	Findings of the WMT 2021 Shared Task on Quality Estimation ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Blain, Frédéric; Chaudhary, Vishrav. - : Underline Science Inc., 2021
	BASE

3	Pushing the Right Buttons: Adversarial Evaluation of Quality Estimation ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; ., Tharindu; Blain, Frédéric. - : Underline Science Inc., 2021
	BASE

4	Multilingual Domain Adaptation for NMT: Decoupling Language and Domain Information with Adapters ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Berard, Alexandre; Cooper Stickland, Asa. - : Underline Science Inc., 2021
	BASE

5	Robust Open-Vocabulary Translation from Visual Text Representations ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Etter, Dave; Post, Matt. - : Underline Science Inc., 2021
	BASE

6	Contrastive Learning for Context-aware Neural Machine Translation Using Coreference Information ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; ., Kyomin; Hwang, Yongkeun. - : Underline Science Inc., 2021
	BASE

7	To Ship or Not to Ship: An Extensive Evaluation of Automatic Metrics for Machine Translation ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Kocmi, Tom. - : Underline Science Inc., 2021
	BASE

8	Identifying the Importance of Content Overlap for Better Cross-lingual Embedding Mappings ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Berend, Gábor; Cserháti, Réka. - : Underline Science Inc., 2021
	BASE

9	Simultaneous Neural Machine Translation with Constituent Label Prediction ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Kano, Yasumasa; Nakamura, Satoshi. - : Underline Science Inc., 2021
	BASE

10	Just Ask! Evaluating Machine Translation by Asking and Answering Questions ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; ., Marie-Francine; Ghadery, Erfan. - : Underline Science Inc., 2021
	BASE

11	An Analysis of Euclidean vs. Graph-Based Framing for Bilingual Lexicon Induction from Word Embedding Spaces ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; ., Ali; Alyakin, Anyon. - : Underline Science Inc., 2021
	BASE

12	Findings of the WMT Shared Task on Machine Translation Using Terminologies ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Alam, Md Mahfuz Ibn; Kvapilikova, Ivana. - : Underline Science Inc., 2021
	BASE

13	Translation Transformers Rediscover Inherent Data Domains ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Del, Maksym; Fishel, Mark. - : Underline Science Inc., 2021
	BASE

14	Phrase-level Active Learning for Neural Machine Translation ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Hu, Junjie; Neubig, Graham. - : Underline Science Inc., 2021
	BASE

15	A Fine-Grained Analysis of BERTScore ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Bojar, Ondřej; Hanna, Michael. - : Underline Science Inc., 2021
	BASE

16	Wine is not v i n. On the Compatibility of Tokenizations across Languages ...
	The 2021 Conference on Empirical Methods in Natural Language Processing 2021; Dufter, Philipp; Maronikolakis, Antonis. - : Underline Science Inc., 2021
	Abstract: The size of the vocabulary is a central design choice in large pretrained language models, with respect to both performance and memory requirements. Typically, subword tokenization algorithms such as byte pair encoding and WordPiece are used. In this work, we investigate the compatibility of tokenizations for multilingual static and contextualized embedding spaces and propose a measure that reflects the compatibility of tokenizations across languages. Our goal is to prevent incompatible tokenizations, e.g., "wine" (word-level) in English vs. "v i n" (character-level) in French, which make it hard to learn good multilingual semantic representations. We show that our compatibility measure allows the system designer to create vocabularies across languages that are compatible -- a desideratum that so far has been neglected in multilingual models. ...
	Keyword: Bilingual Lexicon Induction; Language Models; Natural Language Processing
	URL: https://dx.doi.org/10.48448/4bn9-4p23 https://underline.io/lecture/38413-wine-is-not-v-i-n.-on-the-compatibility-of-tokenizations-across-languages
	BASE
