
Search in the Catalogues and Directories

Hits 1 – 18 of 18

1. "There's no rules. It's hackathon.": Negotiating Commitment in a Context of Volatile Sociality
In: American Anthropological Association (2015)
BASE
2. 2007 NIST Language Recognition Evaluation Test Set
Martin, Alvin; Le, Audrey. Linguistic Data Consortium, 2009. https://www.ldc.upenn.edu
Abstract: *Introduction* The 2007 NIST Language Recognition Evaluation Test Set was developed by the Linguistic Data Consortium (LDC) and the National Institute of Standards and Technology (NIST). It consists of 66 hours of conversational telephone speech segments in the following languages and dialects: Arabic, Bengali, Chinese (Cantonese), Mandarin Chinese (Mainland, Taiwan), Chinese (Min), English (American, Indian), Farsi, German, Hindustani (Hindi, Urdu), Korean, Russian, Spanish (Caribbean, non-Caribbean), Tamil, Thai, and Vietnamese. The goal of NIST's Language Recognition Evaluation (LRE) is to establish a baseline of current performance capability for language recognition of conversational telephone speech and to lay the groundwork for further research in the field. NIST conducted three previous language recognition evaluations, in 1996, 2003, and 2005. The most significant differences between those evaluations and the 2007 task were the increased number of languages and dialects, the greater emphasis on a basic detection task, and the variety of evaluation conditions. Thus, in 2007, given a segment of speech and a language of interest to be detected (i.e., a target language), the task was to decide whether that target language was in fact spoken in the given telephone speech segment (yes or no), based on an automated analysis of the data contained in the segment. Further information regarding this evaluation can be found in the evaluation plan included in the documentation for this release.
The training data for LRE 2007 consists of the following:
* 2003 NIST Language Recognition Evaluation (LDC2006S31): (1) approximately 46 hours of conversational telephone speech segments in the target languages and dialects, and (2) the 1996 LRE test data (conversational telephone speech in Arabic (Egyptian colloquial), English (General American, Southern American), Farsi, French, German, Hindi, Japanese, Korean, Mandarin Chinese (Mainland, Taiwan), Spanish (Caribbean, non-Caribbean), Tamil, and Vietnamese).
* 2005 NIST Language Recognition Evaluation (LDC2008S05): approximately 44 hours of conversational telephone speech in English (American, Indian), Hindi, Japanese, Korean, Mandarin Chinese (Mainland, Taiwan), Spanish (Mexican), and Tamil.
* 2007 NIST Language Recognition Evaluation Supplemental Training Data (LDC2009S05): 118 hours of conversational telephone speech segments in Arabic (Egyptian colloquial), Bengali, Min Nan Chinese, Wu Chinese, Taiwan Mandarin, Cantonese, Russian, Mexican Spanish, Thai, Urdu, and Tamil.
LDC has released other LRE corpora as:
* 2003 NIST Language Recognition Evaluation (LDC2006S31)
* 2005 NIST Language Recognition Evaluation (LDC2008S05)
* 2007 NIST Language Recognition Evaluation Supplemental Training Data (LDC2009S05)
* 2009 NIST Language Recognition Evaluation Test Set (LDC2014S06)
* 2011 NIST Language Recognition Evaluation Test Set (LDC2018S06)
*Data* Each speech file in the test data is one side of a 4-wire telephone conversation represented in 8-bit, 8-kHz mu-law format. There are 7,530 speech files in SPHERE (.sph) format, for a total of 66 hours of speech. The speech data was compiled from LDC's CALLFRIEND, Fisher Spanish, and Mixer 3 corpora and from data collected by Oregon Health and Science University (OHSU), Beaverton, Oregon. The test segments contain three nominal durations of speech: 3 seconds, 10 seconds, and 30 seconds. Actual speech durations vary but were constrained to the ranges of 2-4 seconds, 7-13 seconds, and 23-35 seconds, respectively. Non-speech portions were included in each segment so that a segment contained a continuous sample of the source recording; a test segment may therefore be significantly longer than its nominal speech duration, depending on how much non-speech was included. Unlike previous evaluations, the nominal duration of each test segment was not identified.
*Samples* A sample (WAV) of the data in this corpus is available from the catalog page.
*Updates* None at this time.
URL: https://catalog.ldc.upenn.edu/LDC2009S04
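The 8-bit mu-law sample format described in the *Data* section above can be expanded to linear PCM with the standard G.711 mu-law rule. The sketch below is illustrative only and is not part of the LDC release; the assumption that an .sph file's audio payload follows an ASCII NIST header (whose size is stated inside the header itself) comes from the SPHERE container convention, not from this catalog entry.

```python
# Minimal sketch: expand one 8-bit G.711 mu-law code to a 16-bit linear
# PCM sample, as would be needed to play back the mu-law .sph payloads
# described above. Assumption: samples are stored bit-complemented, per
# the usual G.711 convention.

BIAS = 0x84  # 132, the G.711 mu-law bias


def mulaw_decode(code: int) -> int:
    """Expand one 8-bit mu-law code to a 16-bit linear PCM sample."""
    code = ~code & 0xFF                # stored bytes are complemented
    sign = code & 0x80                 # top bit carries the sign
    exponent = (code >> 4) & 0x07      # 3-bit segment number
    mantissa = code & 0x0F             # 4-bit step within the segment
    sample = (((mantissa << 3) + BIAS) << exponent) - BIAS
    return -sample if sign else sample


if __name__ == "__main__":
    print(mulaw_decode(0xFF))  # 0 (silence)
    print(mulaw_decode(0x80))  # 32124, the largest positive amplitude
```

Decoding every payload byte this way yields signed 16-bit samples at 8 kHz, which most audio tooling can then consume directly.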
BASE
3. 2007 NIST Language Recognition Evaluation Supplemental Training Set
Martin, Alvin; Le, Audrey; Graff, David. Linguistic Data Consortium, 2009. https://www.ldc.upenn.edu
BASE
4. 2007 NIST Language Recognition Evaluation Supplemental Training Set ...
Martin, Alvin; Le, Audrey; Graff, David. Linguistic Data Consortium, 2009
BASE
5. 2007 NIST Language Recognition Evaluation Test Set ...
Martin, Alvin; Le, Audrey. Linguistic Data Consortium, 2009
BASE
6. 2005 NIST Language Recognition Evaluation
Le, Audrey; Martin, Alvin; Hadfield, Hannah. Linguistic Data Consortium, 2008. https://www.ldc.upenn.edu
BASE
7. 2005 NIST Language Recognition Evaluation ...
Le, Audrey; Martin, Alvin; Hadfield, Hannah. Linguistic Data Consortium, 2008
BASE
8. NIST speaker recognition evaluations utilizing the Mixer Corpora - 2004, 2005, 2006
In: IEEE Transactions on Audio, Speech and Language Processing. New York, NY: IEEE 15 (2007) 7, 1951-1959
BLLDB
OLC Linguistik
9. 2004 Spring NIST Rich Transcription (RT-04S) Development Data
Fiscus, Jonathan G.; Garofolo, John S.; Le, Audrey. Linguistic Data Consortium, 2007. https://www.ldc.upenn.edu
BASE
10. 2004 Spring NIST Rich Transcription (RT-04S) Evaluation Data
Fiscus, Jonathan G.; Garofolo, John S.; Le, Audrey. Linguistic Data Consortium, 2007. https://www.ldc.upenn.edu
BASE
11. 2003 NIST Rich Transcription Evaluation Data
Fiscus, Jonathan G.; Doddington, George R.; Le, Audrey. Linguistic Data Consortium, 2007. https://www.ldc.upenn.edu
BASE
12. 2004 Spring NIST Rich Transcription (RT-04S) Development Data ...
Fiscus, Jonathan G.; Garofolo, John S.; Le, Audrey. Linguistic Data Consortium, 2007
BASE
13. 2003 NIST Rich Transcription Evaluation Data ...
Fiscus, Jonathan G.; Doddington, George R.; Le, Audrey. Linguistic Data Consortium, 2007
BASE
14. 2004 Spring NIST Rich Transcription (RT-04S) Evaluation Data ...
Fiscus, Jonathan G.; Garofolo, John S.; Le, Audrey. Linguistic Data Consortium, 2007
BASE
15. Effects of speech recognition accuracy on the performance of DARPA Communicator spoken dialogue systems
In: International Journal of Speech Technology. Boston, Mass. [et al.]: Kluwer Academic Publishers 7 (2004) 4, 293-309
BLLDB
OLC Linguistik
16. 2002 Rich Transcription Broadcast News and Conversational Telephone Speech
Garofolo, John S.; Fiscus, Jonathan G.; Le, Audrey. Linguistic Data Consortium, 2004. https://www.ldc.upenn.edu
BASE
17. 2002 Rich Transcription Broadcast News and Conversational Telephone Speech ...
Garofolo, John S.; Fiscus, Jonathan G.; Le, Audrey. Linguistic Data Consortium, 2004
BASE
18. Effects of Speech Recognition Accuracy on the Performance of DARPA Communicator Spoken Dialogue Systems
In: DTIC (2004)
BASE

Hits by source: Catalogues (2), Bibliographies (2), Linked Open Data catalogues (0), Online resources (0), Open access documents (16)
© 2013 - 2024 Lin|gu|is|tik