1 |
Machine Recognition vs Human Recognition of Voices
In: DTIC (2012)
2 |
Using Prosody for Automatic Sentence Segmentation of Multi-Party Meetings
In: DTIC (2006)
Abstract:
We explore the use of prosodic features beyond pauses, including duration, pitch, and energy features, for automatic sentence segmentation of ICSI meeting data. We examine two different approaches to boundary classification: score-level combination of independent language and prosodic models using HMMs, and feature-level combination of models using a boosting-based method (BoosTexter). We report classification results for reference word transcripts as well as for transcripts from a state-of-the-art automatic speech recognizer (ASR). We also compare results using the lexical model plus a pause-only prosody model, versus results using additional prosodic features. Results show that (1) information from pauses is important, including pause duration both at the boundary and at the previous and following word boundaries; (2) adding duration, pitch, and energy features yields significant improvement over pause alone; (3) the integrated boosting-based model performs better than the HMM for ASR conditions; (4) training the boosting-based model on recognized words yields further improvement.
Presented at the International Conference on Text, Speech, and Dialogue (9th) (TSD 2006) held in Brno, Czech Republic on 11-15 Sep 2006. Published in the Proceedings of the International Conference on Text, Speech, and Dialogue (9th), 2006. Sponsored in part by National Science Foundation contract no. IIS-0121396.
Keyword:
*BOUNDARIES; *MODELS; *PROSODIC FEATURES; *PROSODY; *SENTENCE SEGMENTATION; *SPEECH RECOGNITION; ALGORITHMS; ASR(AUTOMATIC SPEECH RECOGNIZER); AUTOMATIC; BOOSTING; CLASSIFICATION; CLASSIFIERS; Cybernetics; HMM(HIDDEN MARKOV MODELS); LEXICAL FEATURES; Linguistics; MARKOV PROCESSES; PAUSES; SYMPOSIA; TEAMS(PERSONNEL); Voice Communications; WORDS(LANGUAGE)
URL: http://www.dtic.mil/docs/citations/ADA459015 http://oai.dtic.mil/oai/oai?&verb=getRecord&metadataPrefix=html&identifier=ADA459015
3 |
Text Detection and Translation from Natural Scenes
In: DTIC (2001)
4 |
Isolated Speech Recognition Using Artificial Neural Networks
In: DTIC (2001)
5 |
Automatic Language Identification with Sequences of Language-Independent Phoneme Clusters.
In: DTIC AND NTIS (1996)
6 |
Diphone-Based Speech Recognition Using Neural Networks.
In: DTIC AND NTIS (1996)
7 |
A Robust Loose Coupling for Speech Recognition and Natural Understanding.
In: DTIC AND NTIS (1995)
8 |
Spoken Dialogue Understanding and Local Context.
In: DTIC AND NTIS (1994)
9 |
Segment-Based Acoustic Models for Continuous Speech Recognition.
In: DTIC AND NTIS (1994)
10 |
Robust Continuous Speech Recognition.
In: DTIC AND NTIS (1994)
11 |
Segment-Based Acoustic Models for Continuous Speech Recognition.
In: DTIC AND NTIS (1994)
12 |
TRAINS: Dialogue Transcription Tools.
In: DTIC AND NTIS (1994)
13 |
High-Performance Speech Recognition Using Consistency Modeling.
In: DTIC AND NTIS (1994)
14 |
Dialog Structure and Plan Recognition in Spontaneous Spoken Dialog
In: DTIC AND NTIS (1993)
15 |
Research on Narrowband Communications
In: DTIC AND NTIS (1981)
16 |
Research on Narrowband Communications
In: DTIC AND NTIS (1980)
19 |
Enhancement and Bandwidth Compression of Noisy Speech by Estimation of Speech and Its Model Parameters
In: DTIC AND NTIS (1978)