Hits 41 – 60 of 106

41	Speech Controlled Computing ...
	Cieri, Christopher; Miller, David; Martey, Nii O.. - : Linguistic Data Consortium, 2006
	BASE

42	Switchboard Cellular Part 2 Audio
	Graff, David; Walker, Kevin; Miller, David. - : Linguistic Data Consortium, 2004. : https://www.ldc.upenn.edu, 2004
	BASE

43	Switchboard Cellular Part 2 Audio ...
	Graff, David; Walker, Kevin; Miller, David. - : Linguistic Data Consortium, 2004
	BASE

44	Conversational Telephone Speech Corpus Collection for the NIST Speaker Recognition Evaluation 2004
	Martin, Alvin; Miller, David; Przybocki, Mark...
	In: DTIC (2004)
	BASE

45	The Mixer Corpus of Multilingual, Multichannel Speaker Recognition Data
	Cieri, Christopher; Campbell, Joseph P.; Nakasone, Hirotaka...
	In: DTIC (2004)
	BASE

46	Multiple-Translation Chinese (MTC) Part 2
	Huang, Shudong; Graff, David; Walker, Kevin. - : Linguistic Data Consortium, 2003. : https://www.ldc.upenn.edu, 2003
	BASE

47	Multiple-Translation Arabic (MTA) Part 1
	Walker, Kevin; Bamba, Moussa; Miller, David; Ma, Xiaoyi; Cieri, Christopher; Doddington, George R.. - : Linguistic Data Consortium, 2003. : https://www.ldc.upenn.edu, 2003
	Abstract:

	Introduction

	Multiple-Translation Arabic (MTA) Part 1 was developed by the Linguistic Data Consortium (LDC) and contains approximately 23,000 words of Arabic news text with 13 sets of English translations (10 by humans and 3 by automatic machine translation (MT) systems), along with assessments of the MT output.

	To support the development of automatic means for evaluating translation quality, the LDC was sponsored to solicit ten sets of human translations for a single set of Arabic source materials. The LDC was also asked to produce translations from various commercial off-the-shelf (COTS) systems, including commercial MT systems and ones available on the Internet. There are a total of two sets of COTS outputs and one output set from a TIDES 2002 MT Evaluation participant, which is representative of state-of-the-art research systems. The goal of this effort is to evaluate the quality of TIDES research, human translation teams and COTS systems.

	To see whether automatic evaluation systems such as BLEU track human assessment, the LDC has also performed human assessment on the two COTS outputs and the TIDES research system. The corpus includes the assessment results for one of the two COTS systems, the assessment results for the TIDES research system, and the specifications used for conducting the assessments.

	This corpus represents the first part of a collection of multiple-translation Arabic. The second part is available from LDC as Multiple-Translation Arabic (MTA) Part 2 (LDC2005T05).

	Data

	The data in this corpus break down by source as follows:

	Source                 Abbreviation  Stories  Words
	Xinhua News Service    Xinhua        66       11,155
	Agence France Presse   AFP           75       12,674
	Totals                               141      23,829

	There are 141 source files and 1,792 translation files: 12 of the 13 systems produced translations for all 141 source files, while one system produced translations for only 100 of the 141 Arabic stories.

	The story selection from the two newswire collections was controlled by story length: all selected stories contain between 700 and 1,500 Arabic characters. The Xinhua data was drawn from the Xinhua News Agency's Arabic newswire feed in October 2001. The AFP data was drawn from LDC's Arabic Newswire Part 1 (LDC2001T55).

	The MT outputs were evaluated on the basis of adequacy and fluency, using the human translations as the gold standard. Adequacy refers to the degree to which the translation communicates information present in the original source language text. Fluency refers to the degree to which the translation is well-formed according to the grammar of the target language.

	The human translation teams initially submitted six stories, which were returned with feedback before the teams were assigned the rest of the material. Further submissions were continually monitored for quality. Ranking of manual translations was performed by two LDC staff members, one an Arabic-dominant bilingual and the other an English native monolingual. There was overall agreement between the two, and minor discrepancies were resolved through discussion and comparison of additional files. The ranking method was unstructured and somewhat casual; it is not intended to be definitive, or even accountable.

	The original source files used CP-1256 encoding for the Arabic characters, and SGML tags for marking sentence and paragraph boundaries and other information about each story. The source files were later converted to UTF-8 encoding. To make things easier for the translators, nearly all SGML tags were removed or replaced by "plain text" markers. The assessment is presented in a .txt file with comma-separated fields containing judgements and identification info.

	Samples

	For an example of the data in this corpus, please view this source sample (SGML) and its translation (SGML).

	Updates

	None at this time.
	URL: https://catalog.ldc.upenn.edu/LDC2003T18
	BASE
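The abstract for entry 47 gives two details that can be checked or illustrated in a few lines of Python: the translation file count (12 systems covering all 141 stories plus one system covering 100) and the conversion of the source files from CP-1256 to UTF-8. The following is a minimal sketch only. The function names and the sample string are hypothetical and are not taken from the corpus or its documentation, and the actual conversion tooling used by the LDC is not described in the abstract.

```python
def translation_file_count(sources: int, full_systems: int, partial_counts: list[int]) -> int:
    """Count translation files: each full system translates every source
    story, and each partial system contributes its own (smaller) count."""
    return sources * full_systems + sum(partial_counts)


def cp1256_to_utf8(raw: bytes) -> bytes:
    """Re-encode CP-1256 (Windows Arabic) bytes as UTF-8.

    Assumes the input really is CP-1256; bytes undefined in that code
    page raise UnicodeDecodeError instead of being silently replaced.
    """
    return raw.decode("cp1256").encode("utf-8")


if __name__ == "__main__":
    # Figures from the abstract: 141 stories, 12 full systems, 1 system with 100.
    print(translation_file_count(141, 12, [100]))

    # Hypothetical sample text (the Arabic word for "news"), not corpus data.
    sample = "خبر".encode("cp1256")
    print(cp1256_to_utf8(sample).decode("utf-8"))
```

Decoding strictly (the default) is a deliberate choice in this sketch: a decode error is a useful signal that a file is not in the assumed encoding.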

48	Multiple-Translation Chinese (MTC) Part 2 ...
	Huang, Shudong; Graff, David; Walker, Kevin. - : Linguistic Data Consortium, 2003
	BASE

49	Multiple-Translation Arabic (MTA) Part 1 ...
	Walker, Kevin; Bamba, Moussa; Miller, David. - : Linguistic Data Consortium, 2003
	BASE

50	Context-Based Machine Translation ...
	Carbonell, Jaime G.; Klein, Steve; Miller, David. - : Carnegie Mellon University, 2002
	BASE

51	Context-Based Machine Translation ...
	Carbonell, Jaime G.; Klein, Steve; Miller, David. - : Carnegie Mellon University, 2002
	BASE

52	1997 HUB5 Arabic Evaluation
	Graff, David; Martin, Alvin; Miller, David. - : Linguistic Data Consortium, 2002. : https://www.ldc.upenn.edu, 2002
	BASE

53	2001 HUB5 Mandarin Evaluation
	Graff, David; Martin, Alvin; Miller, David. - : Linguistic Data Consortium, 2002. : https://www.ldc.upenn.edu, 2002
	BASE

54	1998 HUB5 English Evaluation
	Graff, David; Martin, Alvin; Miller, David. - : Linguistic Data Consortium, 2002. : https://www.ldc.upenn.edu, 2002
	BASE

55	2001 HUB5 English Evaluation
	Graff, David; Martin, Alvin; Miller, David. - : Linguistic Data Consortium, 2002. : https://www.ldc.upenn.edu, 2002
	BASE

56	Switchboard-2 Phase III Audio
	Graff, David; Miller, David; Walker, Kevin. - : Linguistic Data Consortium, 2002. : https://www.ldc.upenn.edu, 2002
	BASE

57	1998 HUB5 English Evaluation ...
	Graff, David; Martin, Alvin; Miller, David. - : Linguistic Data Consortium, 2002
	BASE

58	2001 HUB5 English Evaluation ...
	Graff, David; Martin, Alvin; Miller, David. - : Linguistic Data Consortium, 2002
	BASE

59	1997 HUB5 Arabic Evaluation ...
	Graff, David; Martin, Alvin; Miller, David. - : Linguistic Data Consortium, 2002
	BASE

60	Switchboard-2 Phase III Audio ...
	Graff, David; Miller, David; Walker, Kevin. - : Linguistic Data Consortium, 2002
	BASE
