1 | Towards the Next 1000 Languages in Multilingual Machine Translation: Exploring the Synergy Between Supervised and Self-Supervised Learning
2 | Quality at a Glance: An Audit of Web-Crawled Multilingual Datasets
In: https://hal.inria.fr/hal-03177623 ; 2021
4 | Language ID in the Wild: Unexpected Challenges on the Path to a Thousand-Language Web Text Corpus
7 | Translationese as a Language in "Multilingual" NMT

Abstract:
Machine translation has an undesirable propensity to produce "translationese" artifacts, which can raise BLEU scores while being rated lower by human judges. Motivated by this, we model translationese and original (i.e. natural) text as separate languages in a multilingual model and ask: can we perform zero-shot translation between original source text and original target text? Since no parallel data pairs original source with original target, we train sentence-level classifiers to distinguish translationese from original target text, and use them to tag the training data for an NMT model. This technique biases the model toward more natural output at test time, yielding gains in human evaluation scores on both accuracy and fluency. Conversely, we show it is possible to bias the model toward translationese and game the BLEU score, increasing it while decreasing human-rated quality. We analyze these models using metrics to measure the degree of ...
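The tagging scheme the abstract describes can be sketched in a few lines. This is a minimal illustration, not the paper's implementation: `is_translationese` stands in for the paper's trained sentence-level classifier (here a trivial heuristic so the sketch runs), and the `<original>`/`<translationese>` tag tokens are hypothetical names, not the paper's exact vocabulary.

```python
def is_translationese(sentence: str) -> bool:
    # Placeholder for the paper's sentence-level classifier; this trivial
    # heuristic exists only so the sketch is runnable.
    return "in order to" in sentence.lower()

def tag_pair(src: str, tgt: str) -> tuple[str, str]:
    # Prepend a style tag to the source sentence based on the classifier's
    # verdict on the target side, so the NMT model learns to associate the
    # tag with the target's style during training.
    tag = "<translationese>" if is_translationese(tgt) else "<original>"
    return f"{tag} {src}", tgt

pairs = [
    ("Das ist ein Haus.", "This is a house."),
    ("Er kam, um zu helfen.", "He came in order to help."),
]
tagged = [tag_pair(s, t) for s, t in pairs]
```

At inference, prepending the `<original>` tag to the input would then bias the model toward natural-sounding output, which is the mechanism behind the human-evaluation gains the abstract reports.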
Keyword: Computation and Language (cs.CL); FOS: Computer and information sciences

URL: https://arxiv.org/abs/1911.03823 ; https://dx.doi.org/10.48550/arxiv.1911.03823