
Search in the Catalogues and Directories

Hits 1 – 20 of 79

1
RATS Speaker Identification
Graff, David; Ma, Xiaoyi; Strassel, Stephanie; Walker, Kevin; Jones, Karen. - : Linguistic Data Consortium, 2021. : https://www.ldc.upenn.edu, 2021
Abstract: *Introduction* RATS Speaker Identification was developed by the Linguistic Data Consortium (LDC) and comprises approximately 1,900 hours of Levantine Arabic, Farsi, Dari, Pashto and Urdu conversational telephone speech with annotations of speech segments. The audio was retransmitted over eight channels, yielding 17,000 hours of audio in total. The corpus was created to provide training and development sets for the Speaker Identification (SID) task in the DARPA RATS (Robust Automatic Transcription of Speech) program.

The goal of the RATS program was to develop human language technology systems capable of performing speech detection, language identification, speaker identification and keyword spotting on the severely degraded audio signals typical of radio communication channels, especially those using handheld portable transceivers. To support that goal, LDC assembled a system for the transmission, reception and digital capture of audio data that allowed a single source audio signal to be distributed and recorded over eight distinct transceiver configurations simultaneously. Those configurations included three frequency bands (high, very high and ultra high), variously combined with amplitude modulation, frequency hopping spread spectrum, narrow-band frequency modulation, single-sideband or wide-band frequency modulation. Annotations on the clear source audio signal, e.g., time boundaries for the duration of speech activity, were projected onto the corresponding eight channels recorded from the radio receivers.

*Data* The source audio consists of conversational telephone speech recordings collected by LDC specifically for the RATS program from native speakers of Levantine Arabic, Pashto, Urdu, Farsi and Dari. Annotations on the audio files include start time, end time, speech activity detection (SAD) label, SAD provenance, speaker ID, speaker ID provenance, language ID, and language ID provenance. The data is divided into training and development sets, each with its own audio and annotation subdirectories. All audio files are single-channel, 16-bit PCM, sampled at 16,000 samples per second, and compressed with lossless FLAC. When uncompressed, the files have typical "MS-WAV" (RIFF) file headers. Annotation files are tab-delimited, UTF-8 encoded plain text.

*Sponsorship* This material is based upon work supported by the Defense Advanced Research Projects Agency (DARPA) under Contract No. D10PC20016. The content does not necessarily reflect the position or the policy of the Government, and no official endorsement should be inferred.

*Samples* Please view the following samples:
* Source Audio Sample (FLAC)
* Annotation Sample (TXT)
* Retransmission Audio Sample (FLAC)
* Retransmission Annotation Sample (TXT)

*Updates* None at this time.
URL: https://catalog.ldc.upenn.edu/LDC2021S08
BASE
2
RATS Speaker Identification ...
Graff, David; Ma, Xiaoyi; Strassel, Stephanie. - : Linguistic Data Consortium, 2021
BASE
3
LORELEI Ukrainian Representative Language Pack
Tracey, Jennifer; Strassel, Stephanie; Graff, David. - : Linguistic Data Consortium, 2020. : https://www.ldc.upenn.edu, 2020
BASE
4
Chinese Lexical Resources for Gender, Number, Animacy
Chen, Song; Yuan, Jiahong; Ma, Xiaoyi. - : Linguistic Data Consortium, 2020. : https://www.ldc.upenn.edu, 2020
BASE
5
LORELEI Ukrainian Representative Language Pack ...
Tracey, Jennifer; Strassel, Stephanie; Graff, David. - : Linguistic Data Consortium, 2020
BASE
6
Chinese Lexical Resources for Gender, Number, Animacy ...
Chen, Song; Yuan, Jiahong; Ma, Xiaoyi. - : Linguistic Data Consortium, 2020
BASE
7
LORELEI Somali Representative Language Pack - Monolingual and Parallel Text
Tracey, Jennifer; Graff, David; Strassel, Stephanie. - : Linguistic Data Consortium, 2018. : https://www.ldc.upenn.edu, 2018
BASE
8
RATS Language Identification
Graff, David; Ma, Xiaoyi; Strassel, Stephanie. - : Linguistic Data Consortium, 2018. : https://www.ldc.upenn.edu, 2018
BASE
9
LORELEI Amharic Representative Language Pack - Monolingual and Parallel Text
Tracey, Jennifer; Graff, David; Strassel, Stephanie. - : Linguistic Data Consortium, 2018. : https://www.ldc.upenn.edu, 2018
BASE
10
LORELEI Amharic Representative Language Pack - Monolingual and Parallel Text ...
Tracey, Jennifer; Graff, David; Strassel, Stephanie. - : Linguistic Data Consortium, 2018
BASE
11
RATS Language Identification ...
Graff, David; Ma, Xiaoyi; Strassel, Stephanie. - : Linguistic Data Consortium, 2018
BASE
12
LORELEI Somali Representative Language Pack - Monolingual and Parallel Text ...
Tracey, Jennifer; Graff, David; Strassel, Stephanie. - : Linguistic Data Consortium, 2018
BASE
13
RATS Keyword Spotting
Graff, David; Ma, Xiaoyi; Strassel, Stephanie. - : Linguistic Data Consortium, 2017. : https://www.ldc.upenn.edu, 2017
BASE
14
GALE English-Chinese Parallel Aligned Treebank -- Training
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie. - : Linguistic Data Consortium, 2017. : https://www.ldc.upenn.edu, 2017
BASE
15
RATS Keyword Spotting ...
Graff, David; Ma, Xiaoyi; Strassel, Stephanie. - : Linguistic Data Consortium, 2017
BASE
16
GALE English-Chinese Parallel Aligned Treebank -- Training ...
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie. - : Linguistic Data Consortium, 2017
BASE
17
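A aprendizagem da língua portuguesa pela comunidade chinesa a residir em Portugal: um estudo sobre diferentes gerações [Portuguese language learning by the Chinese community residing in Portugal: a study of different generations]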
Ma Xiaoyi. - 2016
BASE
18
GALE Chinese-English Parallel Aligned Treebank -- Training
Li, Xuansong; Grimes, Stephen; Strassel, Stephanie. - : Linguistic Data Consortium, 2015. : https://www.ldc.upenn.edu, 2015
BASE
19
RATS Speech Activity Detection
Walker, Kevin; Ma, Xiaoyi; Graff, David. - : Linguistic Data Consortium, 2015. : https://www.ldc.upenn.edu, 2015
BASE
20
RATS Speech Activity Detection ...
Walker, Kevin; Ma, Xiaoyi; Graff, David. - : Linguistic Data Consortium, 2015
BASE

© 2013 - 2024 Lin|gu|is|tik