Catalogue search

Hits 41 – 60 of 72

41	Universal Dependencies 2.1
	Nivre, Joakim; Agić, Željko; Ahrenberg, Lars. - : Universal Dependencies Consortium, 2017
	BASE

42	CoNLL 2017 Shared Task: Multilingual Parsing from Raw Text to Universal Dependencies
	Çöltekin, Çağrı; Kayadelen, Tolga; Droganova, Kira. - : Association for Computational Linguistics, Stroudsburg, PA, USA, 2017
	BASE

43	Universal Dependencies for the AnCora treebanks
	Martinez Alonso, Hector; Zeman, Daniel
	In: Procesamiento del Lenguaje Natural (ISSN 1135-5948), Sociedad Española para el Procesamiento del Lenguaje Natural, 2016 ; https://hal.inria.fr/hal-01426751 ; http://journal.sepln.org/sepln/ojs/ojs/index.php/pln/issue/view/220
	BASE

44	Deltacorpus 1.1
	Mareček, David; Yu, Zhiwei; Zeman, Daniel. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2016
	BASE

45	Open SDP
	Flickinger, Dan; Hajič, Jan; Ivanova, Angelina. - : Oslo University; Charles University, 2016
	BASE

46	Deltacorpus
	Mareček, David; Yu, Zhiwei; Zeman, Daniel. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2016
	BASE

47	Universal Dependencies 1.4
	Nivre, Joakim; Agić, Željko; Ahrenberg, Lars. - : Universal Dependencies Consortium, 2016
	BASE

48	Universal Dependencies 1.3
	Nivre, Joakim; Agić, Željko; Ahrenberg, Lars. - : Universal Dependencies Consortium, 2016
	BASE

49	SDP 2014 & 2015: Broad Coverage Semantic Dependency Parsing
	Flickinger, Dan; Hajič, Jan; Ivanova, Angelina. - : Linguistic Data Consortium (https://www.ldc.upenn.edu), 2016
	BASE

50	SDP 2014 & 2015: Broad Coverage Semantic Dependency Parsing ...
	Flickinger, Dan; Hajič, Jan; Ivanova, Angelina. - : Linguistic Data Consortium, 2016
	BASE

51	Universal Dependencies 1.1
	Agić, Željko; Aranzabe, Maria Jesus; Atutxa, Aitziber. - : Universal Dependencies Consortium, 2015
	BASE

52	HamleDT 3.0
	Zeman, Daniel; Mareček, David; Mašek, Jan. - : Charles University, 2015
	BASE

53	Universal Dependencies 1.2
	Nivre, Joakim; Agić, Željko; Aranzabe, Maria Jesus. - : Universal Dependencies Consortium, 2015
	BASE

54	Lingua::Interset 2.026
	Zeman, Daniel. - : Charles University, Faculty of Mathematics and Physics, 2015
	BASE

55	Universal Dependencies 1.0
	Nivre, Joakim; Bosco, Cristina; Choi, Jinho. - : Universal Dependencies Consortium, 2015
	BASE

56	Czech Machine Translation in the project CzechMATE
	Bojar, Ondřej; Zeman, Daniel
	In: The Prague bulletin of mathematical linguistics. - Praha : Univ. (2014) 101, 71-96
	OLC Linguistik

57	HindMonoCorp 0.5
	Bojar, Ondřej; Diatka, Vojtěch; Rychlý, Pavel; Straňák, Pavel; Suchomel, Vít; Tamchyna, Aleš; Zeman, Daniel. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2014
	Abstract: Hindi monolingual corpus. It is based primarily on web crawls performed using various tools and at various times. Since the web is a living data source, we treat these crawls as completely separate sources, even though they may overlap. To estimate the magnitude of this overlap, we compared the total number of segments obtained by concatenating the individual sources (each deduplicated on its own) with the number of segments obtained by deduplicating all sources together. The difference is only around 1%, confirming that the various web crawls (or their subsequent processing) differ significantly. HindMonoCorp contains data from the following sources. Hindi web texts: a monolingual corpus containing mainly Hindi news articles has already been collected and released by Bojar et al. (2008). We use the HTML files as crawled for this corpus in 2010, add a small crawl performed in 2013, and re-process them with the current pipeline. These sources are denoted HWT 2010 and HWT 2013 in the following. W2C: Hindi corpora in W2C were collected by Martin Majliš during his project to automatically collect corpora in many languages (Majliš and Žabokrtský, 2012). There are in fact two Hindi corpora available, one from a web harvest (W2C Web) and one from Wikipedia (W2C Wiki). SpiderLing: a web crawl carried out during November and December 2013 using SpiderLing (Suchomel and Pomikálek, 2012). The pipeline includes extraction of plain text and deduplication at the level of documents, see below. CommonCrawl: a non-profit organization that regularly crawls the web and provides the data to anyone. We are grateful to Christian Buck for extracting plain-text Hindi segments from the 2012 and 2013-fall crawls for us. Intercorp: 7 books with their translations, scanned and manually aligned per paragraph. RSS feeds: feeds from Webdunia.com and the Hindi version of BBC International, followed by our custom crawler from September 2013 till January 2014. ; LM2010013
	Keyword: corpus
	URL: http://hdl.handle.net/11858/00-097C-0000-0023-6260-A
	BASE
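	Note: The overlap estimate described in the HindMonoCorp abstract can be illustrated with a minimal sketch. It assumes that segments are plain strings and that deduplication means exact string matching; the abstract does not state how the corpus pipeline deduplicates. The names `dedup`, `overlap_ratio`, and the toy sources are hypothetical and are not part of the HindMonoCorp release.

```python
from typing import Dict, Iterable, List


def dedup(segments: Iterable[str]) -> List[str]:
    """Drop exact duplicate segments, keeping first occurrences in order."""
    return list(dict.fromkeys(segments))


def overlap_ratio(sources: Dict[str, List[str]]) -> float:
    """Relative drop in segment count when sources are deduplicated jointly.

    Assumption: the overlap is measured as
    (sum of per-source deduplicated sizes - jointly deduplicated size)
    divided by the sum of per-source deduplicated sizes.
    """
    per_source_total = sum(len(dedup(segs)) for segs in sources.values())
    joint_total = len(dedup(seg for segs in sources.values() for seg in segs))
    if per_source_total == 0:
        return 0.0
    return (per_source_total - joint_total) / per_source_total


# Toy data (hypothetical): two small "crawls" sharing one segment.
toy_sources = {
    "crawl_a": ["s1", "s2", "s2", "s3"],
    "crawl_b": ["s3", "s4"],
}
ratio = overlap_ratio(toy_sources)
```

	With this toy input the per-source sizes sum to 5 and the joint size is 4, so the ratio is 0.2. A value near 0.01 would correspond to the roughly 1% difference reported in the abstract.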

58	HamleDT 2.0
	Zeman, Daniel; Mareček, David; Mašek, Jan. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2014
	BASE

59	HindEnCorp 0.5
	Bojar, Ondřej; Diatka, Vojtěch; Straňák, Pavel. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2014
	BASE

60	Many Czech References for 50 Sentences Selected from WMT11 Data
	Bojar, Ondřej; Macháček, Matouš; Tamchyna, Aleš. - : Charles University, Faculty of Mathematics and Physics, Institute of Formal and Applied Linguistics (UFAL), 2013
	BASE
