1 |
ProZed: A speech prosody editor for linguists, using analysis-by-synthesis.
|
|
|
|
In: Keikichi Hirose and Jianhua Tao. Speech Prosody in Speech Synthesis: Modeling and generation of prosody for high quality and flexible speech synthesis. Springer Verlag, pp. 3-17, 2015, Prosody, Phonology and Phonetics, 978-3-662-45257-8 ; https://hal.archives-ouvertes.fr/hal-01480821 (2015)
|
|
BASE
|
|
2 |
One-to-many voice conversion based on tensor representation of speaker space
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2011/INTERSPEECH_p653-656_t2011-8.pdf (2011)
|
|
BASE
|
|
3 |
Dialect-based speaker classification using speaker invariant dialect features
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/%7Emine/paper/PDF/2010/ISCSLP_p171-176_t2010-12.pdf (2010)
|
|
BASE
|
|
4 |
Pronunciation Proficiency Estimation Based on Multilayer Regression Analysis Using Speaker-independent Structural Features
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2010/L2WS_O2-03_t2010-9.pdf (2010)
|
|
BASE
|
|
5 |
Structural analysis of dialects, sub-dialects, and sub-sub-dialects
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2009/INTERSPEECH_p2219-2222_t2009-9.pdf (2009)
|
|
BASE
|
|
6 |
Improved Structure-based Automatic Estimation of Pronunciation Proficiency
|
|
|
|
In: http://www.eee.bham.ac.uk/SLaTE2009/papers/SLaTE2009-21-v2.pdf (2009)
|
|
BASE
|
|
7 |
Optimal event search using a structural cost function – improvement of structure to speech conversion
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2009/INTERSPEECH_p2047-2050_t2009-9.pdf (2009)
|
|
BASE
|
|
8 |
Automatic recognition of connected vowels only using speaker-invariant representation of speech dynamics
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2007/INTERSPEECH_p890-893_t2007-8.pdf (2007)
|
|
Abstract:
Speech acoustics vary due to differences in gender, age, microphone, room, lines, and a variety of factors. In speech recognition research, to deal with these inevitable non-linguistic variations, thousands of speakers in different acoustic conditions were prepared to train acoustic models of individual phonemes. Recently, a novel representation of speech dynamics was proposed [1, 2], where the above non-linguistic factors are effectively removed from speech as if pitch information is removed from spectrum by its smoothing. This representation captures only speaker- and microphone-invariant speech dynamics and no absolute or static acoustic properties such as spectrums are used. With them, speaker identity has to remain in speech representation. In our previous study, the new representation was applied to recognizing a sequence of isolated vowels [3]. The
|
|
URL: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2007/INTERSPEECH_p890-893_t2007-8.pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.434.5049
|
|
BASE
|
|
9 |
Para-linguistic information represented as distortion of the acoustic universal structure in speech
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2006/ICASSP_p261-264_v1_t2006-5.pdf (2006)
|
|
BASE
|
|
10 |
Structural representation of the non-native pronunciations
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2005/INTERSPEECH_p165-168_t2005-9.pdf (2005)
|
|
BASE
|
|
11 |
Filled pauses as cues to the complexity of following phrases
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2005/INTERSPEECH_p37-40_t2005-9.pdf (2005)
|
|
BASE
|
|
12 |
Japanese vowel recognition using external structure of speech
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2005/ASRU_p203-208_t2005-11.pdf (2005)
|
|
BASE
|
|
13 |
Use of prosodic features for speech recognition
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2004/ICSLP_p1445-1448_t2004-10.pdf (2004)
|
|
BASE
|
|
14 |
Analysis of F0 contours of Cantonese utterances based on the command-response model
|
|
|
|
In: http://mirlab.org/conference_papers/International_Conference/ICSLP+2004/contents/TuC_pdf/TuC201p/TuC201p.14_p1127.pdf (2004)
|
|
BASE
|
|
15 |
Tone name in Middle Chinese system
|
|
|
|
In: http://isca-speech.org/archive_open/archive_papers/ssw5/ssw5_227.pdf (2004)
|
|
BASE
|
|
16 |
Corpus-based synthesis of fundamental frequency contours of Japanese using automatically-generated prosodic corpus and generation process model
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2004/SSW_various-styles_t2004-6.pdf (2003)
|
|
BASE
|
|
17 |
Corpus-based synthesis of fundamental frequency contours of Japanese using automatically-generated prosodic corpus and generation process model
|
|
|
|
In: http://isca-speech.org/archive_open/archive_papers/ssw5/ssw5_161.pdf (2003)
|
|
BASE
|
|
18 |
Automatic estimation of accentual attribute values of words for accent sandhi rules of Japanese text-to-speech conversion
|
|
|
|
In: http://www.gavo.t.u-tokyo.ac.jp/~mine/paper/PDF/2002/SSW_accent-attribute_t2002-9.pdf (2003)
|
|
BASE
|
|
19 |
Data-Driven Synthesis of Fundamental Frequency Contours for TTS Systems Based on a Generation Process Model
|
|
|
|
In: http://www.lpl.univ-aix.fr/sp2002/pdf/hirose-minematsu-eto.pdf (2002)
|
|
BASE
|
|
20 |
N-gram Language Modeling of Japanese Using Prosodic Boundaries
|
|
|
|
In: http://www.lpl.univ-aix.fr/sp2002/pdf/hirose-minematsu-terao.pdf (2002)
|
|
BASE