1 |
Prosodic Hierarchy as an Organizing Framework for the Sources of Context in Phone-Based and Articulatory-Feature-Based Speech Recognition
In: http://www.ifp.uiuc.edu/speech/pubs/2006/hasegawa-johnson06lpss.pdf (2007)
BASE
2 |
Prosody dependent speech recognition on radio news corpus of American English
In: http://prosody.beckman.illinois.edu/pubs/20_Chen_etal_2007_IEEE-TransSpAudProc.pdf (2006)
BASE
3 |
LANDMARK-BASED SPEECH RECOGNITION: REPORT OF THE 2004 Johns Hopkins Summer Workshop
In: http://www.ifp.uiuc.edu/speech/pubs/2005/hasegawa-johnson05icassp.pdf (2005)
BASE
4 |
An Automatic Prosody Labeling System Using ANN-Based
In: http://www.ifp.uiuc.edu/speech/pubs/2004/Chen_ProdRec_ICASSP04.pdf (2004)
BASE
5 |
An intonational phrase boundary and pitch accent dependent speech recognizer
In: http://www.ifp.uiuc.edu/speech/pubs/2003/chen03sci.pdf (2003)
BASE
6 |
Automatic Recognition of Pitch Movements Using Time-Delay Recursive Neural Network
In: http://www.ifp.uiuc.edu/speech/pubs/2003/kim_spl.pdf (2003)
Author affiliations: School of Computer and Information, Yong-In University, South Korea (sskim@yongin.ac.kr); ECE Department, University of Illinois at Urbana-Champaign ({jhasegaw,kenchen}@uiuc.edu)
Abstract:
This paper proposes a novel method for the automatic recognition of pitch accents with no prior knowledge about the phonetic content of the signal (no knowledge of word or phoneme boundaries or of phoneme labels). In the framework presented here, the problem of pitch accent recognition is considered to be a special case of the general problem of context-dependent, non-parametric dynamic contour recognition. The recognition problem is non-parametric because the distribution of F0 is unknown; in particular, there is no evidence that the distribution of F0 is Gaussian. The recognition problem is context-dependent because F0 encodes much more than just prosody: in particular, talker dependence, dependence on speaking style, and short-time acoustic phonetic information encoded in the F0 trajectory must be ignored. The recognition algorithm used in this paper is a time-delay recursive neural network (TDRNN) [8]. A TDRNN is a neural network classifier with two different representations of dynamic context: delayed input nodes allow the representation of an explicit trajectory F0(t), while recursive nodes provide long-term context information that can be used to normalize the input trajectory. Section 2 of this paper describes a selection of papers in the field of prosody-dependent speech recognition, and briefly discusses the importance of the problem. Section 3 describes the TDRNN architecture used in these experiments. Section 4 describes the experimental methods used for the recognition of pitch accents. Section 5 gives the results, and Section 6 presents conclusions.
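The two kinds of dynamic context named in the abstract can be illustrated with a toy forward pass. This is only a minimal sketch under stated assumptions, not the architecture from the paper: the function name `tdrnn_forward`, the layer sizes, the hand-picked weights, and the use of a normalized F0 contour are all hypothetical, and nothing here is trained. Each hidden unit sees a short window of delayed F0 frames (the explicit trajectory) plus its own previous activation (the longer-term context).

```python
import math

def tdrnn_forward(f0, w_delay, w_rec, w_out, bias=0.0):
    """Toy forward pass: tapped-delay inputs plus one recurrent state per hidden unit.

    f0      -- list of F0 values, one per frame (assumed already normalized)
    w_delay -- per hidden unit, one weight per delay tap (hypothetical values)
    w_rec   -- per hidden unit, weight on its own previous activation
    w_out   -- per hidden unit, weight into the single output node
    """
    n_hidden = len(w_delay)
    n_delays = len(w_delay[0])
    hidden = [0.0] * n_hidden
    scores = []
    for t in range(len(f0)):
        # Delayed input nodes: current frame and the n_delays - 1 frames
        # before it, zero-padded at the start of the contour.
        window = [f0[t - d] if t - d >= 0 else 0.0 for d in range(n_delays)]
        new_hidden = []
        for j in range(n_hidden):
            a = sum(w * x for w, x in zip(w_delay[j], window))
            # Recursive node: feed back this unit's previous activation.
            a += w_rec[j] * hidden[j]
            new_hidden.append(math.tanh(a))
        hidden = new_hidden
        z = sum(w * h for w, h in zip(w_out, hidden)) + bias
        # Sigmoid output: a per-frame score between 0 and 1.
        scores.append(1.0 / (1.0 + math.exp(-z)))
    return scores

# Hypothetical contour and weights, chosen only to exercise the code.
contour = [0.0, 0.1, 0.4, 0.9, 0.5, 0.1]
w_delay = [[0.5, -0.5, 0.0], [0.2, 0.2, 0.2]]
w_rec = [0.3, 0.8]
w_out = [1.0, -1.0]
scores = tdrnn_forward(contour, w_delay, w_rec, w_out)
```

In this sketch the first hidden unit responds to frame-to-frame change in F0, while the second, with its larger recurrent weight, accumulates a slowly varying summary of the contour. A trained network would learn such weights from labeled data rather than have them set by hand.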
URL: http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.8.1493 http://www.ifp.uiuc.edu/speech/pubs/2003/kim_spl.pdf
BASE
7 |
Improving the Robustness of Prosody Dependent Language Modeling Based on Prosody Syntax Dependence
In: http://www.ifp.uiuc.edu/speech/pubs/2003/Chen_PDLM_ASRU03.pdf (2003)
BASE
8 |
LANDMARK-BASED SPEECH RECOGNITION: REPORT OF THE 2004
In: http://www.sls.csail.mit.edu/sls/publications/2005/hasegawa-johnson_etal_ICASSP05.pdf
BASE