Hits 1 – 4 of 4

1	Investigating language impact in bilingual approaches for computational language documentation
	Boito, M.Z.; Villavicencio, A.; Besacier, L. - Special Interest Group: Under-resourced Languages (SIGUL), 2020

2	Empirical evaluation of sequence-to-sequence models for word discovery in low-resource settings
	Boito, M.Z.; Villavicencio, A.; Besacier, L. - International Speech Communication Association (ISCA), 2019

3	Unsupervised word segmentation from speech with attention
	Godard, P.; Boito, M.Z.; Ondel, L. - ISCA, 2018

4	Unwritten languages demand attention too! Word discovery with encoder-decoder models
	Boito, M.Z.; Bérard, A.; Villavicencio, A.; Besacier, L. - IEEE, 2018
	Abstract: Word discovery is the task of extracting words from un-segmented text. In this paper we examine to what extent neural networks can be applied to this task in a realistic unwritten language scenario, where only small corpora and limited annotations are available. We investigate two scenarios: one with no supervision and another with limited supervision with access to the most frequent words. Obtained results show that it is possible to retrieve at least 27% of the gold standard vocabulary by training an encoder-decoder neural machine translation system with only 5,157 sentences. This result is close to those obtained with a task-specific Bayesian nonparametric model. Moreover, our approach has the advantage of generating translation alignments, which could be used to create a bilingual lexicon. As a future perspective, this approach is also well suited to work directly from speech.
	URL: http://eprints.whiterose.ac.uk/153555/
