1. Between words and characters: A Brief History of Open-Vocabulary Modeling and Tokenization in NLP
   In: https://hal.inria.fr/hal-03540069 ; 2022
2. SIGMORPHON 2020 Shared Task 0: Typologically Diverse Morphological Inflection ...
3. SIGTYP 2020 Shared Task: Prediction of Typological Features ...
4. Linguistic calibration through metacognition: aligning dialogue agent responses with expected correctness ...
5. Processing South Asian Languages Written in the Latin Script: the Dakshina Dataset ...
6. Spell Once, Summon Anywhere: A Two-Level Open-Vocabulary Language Model ...
8. Unsupervised Disambiguation of Syncretism in Inflected Lexicons ...
   Abstract: Lexical ambiguity makes it difficult to compute various useful statistics of a corpus. A given word form might represent any of several morphological feature bundles. One can, however, use unsupervised learning (as in EM) to fit a model that probabilistically disambiguates word forms. We present such an approach, which employs a neural network to smoothly model a prior distribution over feature bundles (even rare ones). Although this basic model does not consider a token's context, that very property allows it to operate on a simple list of unigram type counts, partitioning each count among different analyses of that unigram. We discuss evaluation metrics for this novel task and report results on 5 languages. Published at NAACL 2018.
   Keywords: Computation and Language (cs.CL); FOS: Computer and information sciences
   URL: https://arxiv.org/abs/1806.03740 ; https://dx.doi.org/10.48550/arxiv.1806.03740