
Search in the Catalogues and Directories

Hits 1 – 20 of 57

1
2011 NIST Language Recognition Evaluation Test Set
Greenberg, Craig; Martin, Alvin; Graff, David; Walker, Kevin; Jones, Karen; Strassel, Stephanie. Linguistic Data Consortium, 2018. https://www.ldc.upenn.edu
Abstract:

*Introduction*

The 2011 NIST Language Recognition Evaluation Test Set was developed by the Linguistic Data Consortium (LDC) and the National Institute of Standards and Technology (NIST). It contains selected training data and the evaluation test set for the 2011 NIST Language Recognition Evaluation: approximately 204 hours of conversational telephone speech and broadcast audio collected by LDC in the following 24 languages and dialects: Arabic (Iraqi), Arabic (Levantine), Arabic (Maghrebi), Arabic (Standard), Bengali, Czech, Dari, English (American), English (Indian), Farsi, Hindi, Lao, Mandarin, Punjabi, Pashto, Polish, Russian, Slovak, Spanish, Tamil, Thai, Turkish, Ukrainian, and Urdu.

The goal of NIST's Language Recognition Evaluation (LRE) is to establish a baseline of current performance capability for language recognition of conversational telephone speech and to lay the groundwork for further research in the field. NIST conducted language recognition evaluations in 1996, 2003, 2005, 2007, and 2009. The 2011 evaluation emphasized the language-pair condition and involved both conversational telephone speech (CTS) and broadcast narrow-band speech (BNBS). Further information regarding this evaluation can be found in the evaluation plan, which is included in the documentation for this release.

LDC released the prior LREs as:

* 2003 NIST Language Recognition Evaluation (LDC2006S31)
* 2005 NIST Language Recognition Evaluation (LDC2008S05)
* 2007 NIST Language Recognition Evaluation Test Set (LDC2009S04)
* 2007 NIST Language Recognition Evaluation Supplemental Training Set (LDC2009S05)
* 2009 NIST Language Recognition Evaluation Test Set (LDC2014S06)

*Data*

This release includes training data for nine language varieties that had not been represented in prior LRE cycles -- Arabic (Iraqi), Arabic (Levantine), Arabic (Maghrebi), Arabic (Standard), Czech, Lao, Punjabi, Polish, and Slovak -- contained in 893 audited segments of roughly 30 seconds' duration and in 400 full-length CTS recordings. The evaluation test set comprises 29,511 audio files, all manually audited at LDC for language and divided equally into three test conditions according to the nominal amount of speech content per segment. Data was collected between 2009 and 2011 and has been released by LDC as individual corpora grouped by language.

The CTS data was obtained using a "claque" collection model in which speakers (claques) called friends or relatives in their social network for a 10-minute conversation in the claque's native language, such that each call involved a unique callee. Participants were free to speak on topics of their own choosing. All calls were routed through a telephone collection system at LDC, which stored the raw mu-law sample stream in separate audio files for each call side. Auditing and selection were applied to the callee side of every call, and to the caller (claque) side in at most one call made by each claque. Contiguous regions containing between 25 and 35 seconds of speech were identified by signal analysis and extracted for manual audit; in some cases, shorter segments were also selected.

Broadcast audio was recorded by capturing satellite-receiver MPEG streams or by digitizing the output of analog audio receivers at 16 kHz. Platforms for data capture were located at LDC and in Tunisia and India. Recordings were analyzed to extract contiguous segments of narrow-band speech of at least 33 seconds' duration; longer segments were trimmed to a maximum length of 35 seconds for audit.

All audited segments for training and test are presented as 8-kHz, 16-bit PCM, single-channel audio files with NIST SPHERE headers. The full-length CTS data is in the same format, except that it has two channels.

*Samples*

Urdu, Pashto, and English samples (SPH) are available on the catalogue page.

*Updates*

None at this time.
URL: https://catalog.ldc.upenn.edu/LDC2018S06
BASE
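The release above describes its audio as 8-kHz, 16-bit PCM, single-channel files with NIST SPHERE headers. As an illustration of that container format -- a sketch of the usual SPHERE header convention (magic line `NIST_1A`, declared header size, typed key-value fields, `end_head` terminator), not LDC's own tooling -- here is a minimal Python parser exercised on a synthetic header:

```python
import io

def read_sphere_header(f):
    """Parse a NIST SPHERE header from a binary file object.

    SPHERE files begin with an ASCII header: the magic line "NIST_1A",
    a line giving the total header size in bytes, typed key-value fields
    ("-i" integer, "-r" real, "-sN" string of N bytes), and an "end_head"
    terminator. Returns (fields, header_size); samples start at header_size.
    """
    if f.readline().strip() != b"NIST_1A":
        raise ValueError("not a NIST SPHERE file")
    header_size = int(f.readline().strip())
    fields = {}
    for raw in iter(f.readline, b""):
        line = raw.decode("ascii").strip()
        if line == "end_head":
            break
        if not line:
            continue
        name, type_flag, value = line.split(None, 2)
        if type_flag == "-i":
            fields[name] = int(value)
        elif type_flag == "-r":
            fields[name] = float(value)
        else:  # "-sN": string value
            fields[name] = value
    f.seek(header_size)  # skip padding; audio data begins at this offset
    return fields, header_size

# Exercise the parser on a synthetic 1024-byte header whose values match
# the release's stated format (8 kHz, 2-byte samples, one channel).
hdr = (b"NIST_1A\n   1024\n"
       b"sample_rate -i 8000\n"
       b"channel_count -i 1\n"
       b"sample_n_bytes -i 2\n"
       b"end_head\n")
hdr += b" " * (1024 - len(hdr))  # header region is padded to the declared size
fields, data_offset = read_sphere_header(io.BytesIO(hdr))
print(fields["sample_rate"], fields["channel_count"], data_offset)
```

Reading the 16-bit samples themselves would then start at `data_offset` bytes into the file.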
2
2011 NIST Language Recognition Evaluation Test Set ...
Greenberg, Craig; Martin, Alvin; Graff, David. Linguistic Data Consortium, 2018.
BASE
3
2010 NIST Speaker Recognition Evaluation Test Set
Greenberg, Craig; Martin, Alvin; Graff, David. Linguistic Data Consortium, 2017. https://www.ldc.upenn.edu
BASE
4
2010 NIST Speaker Recognition Evaluation Test Set ...
Greenberg, Craig; Martin, Alvin; Graff, David. Linguistic Data Consortium, 2017.
BASE
5
2009 NIST Language Recognition Evaluation Test Set
Martin, Alvin; Greenberg, Craig; Graff, David. Linguistic Data Consortium, 2014. https://www.ldc.upenn.edu
BASE
6
2009 NIST Language Recognition Evaluation Test Set ...
Martin, Alvin; Greenberg, Craig; Graff, David. Linguistic Data Consortium, 2014.
BASE
7
2007 NIST Language Recognition Evaluation Test Set
Martin, Alvin; Le, Audrey. Linguistic Data Consortium, 2009. https://www.ldc.upenn.edu
BASE
8
2007 NIST Language Recognition Evaluation Supplemental Training Set
Martin, Alvin; Le, Audrey; Graff, David. Linguistic Data Consortium, 2009. https://www.ldc.upenn.edu
BASE
9
2007 NIST Language Recognition Evaluation Supplemental Training Set ...
Martin, Alvin; Le, Audrey; Graff, David. Linguistic Data Consortium, 2009.
BASE
10
2007 NIST Language Recognition Evaluation Test Set ...
Martin, Alvin; Le, Audrey. Linguistic Data Consortium, 2009.
BASE
11
2005 NIST Language Recognition Evaluation
Le, Audrey; Martin, Alvin; Hadfield, Hannah. Linguistic Data Consortium, 2008. https://www.ldc.upenn.edu
BASE
12
2005 NIST Language Recognition Evaluation ...
Le, Audrey; Martin, Alvin; Hadfield, Hannah. Linguistic Data Consortium, 2008.
BASE
13
NIST speaker recognition evaluations utilizing the Mixer Corpora - 2004, 2005, 2006
In: IEEE Transactions on Audio, Speech, and Language Processing 15 (2007) 7, 1951-1959.
BLLDB
OLC Linguistik
14
2004 Spring NIST Rich Transcription (RT-04S) Development Data
Fiscus, Jonathan G.; Garofolo, John S.; Le, Audrey. Linguistic Data Consortium, 2007. https://www.ldc.upenn.edu
BASE
15
2004 Spring NIST Rich Transcription (RT-04S) Evaluation Data
Fiscus, Jonathan G.; Garofolo, John S.; Le, Audrey. Linguistic Data Consortium, 2007. https://www.ldc.upenn.edu
BASE
16
2004 Spring NIST Rich Transcription (RT-04S) Development Data ...
Fiscus, Jonathan G.; Garofolo, John S.; Le, Audrey. Linguistic Data Consortium, 2007.
BASE
17
2004 Spring NIST Rich Transcription (RT-04S) Evaluation Data ...
Fiscus, Jonathan G.; Garofolo, John S.; Le, Audrey. Linguistic Data Consortium, 2007.
BASE
18
NIST and NFI-TNO evaluations of automatic speaker recognition
In: Computer Speech and Language (Elsevier) 20 (2006) 2-3, 128-158.
BLLDB
19
2003 NIST Language Recognition Evaluation
Martin, Alvin; Przybocki, Mark. Linguistic Data Consortium, 2006. https://www.ldc.upenn.edu
BASE
20
2004 NIST Speaker Recognition Evaluation
Martin, Alvin; Przybocki, Mark. Linguistic Data Consortium, 2006. https://www.ldc.upenn.edu
BASE


Hits by source:
Catalogues: 1
Bibliographies: 4
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 53
© 2013 - 2024 Lin|gu|is|tik