1 |
Speech-Centric Information Processing: An Optimization-Oriented Approach
|
|
|
|
In: http://research.microsoft.com/pubs/179540/ProcIEEE_He_deng_finalsub.pdf (2012)
|
|
Abstract:
Automatic speech recognition is a central and common component of voice-driven information processing systems in human language technology, including spoken language translation, spoken language understanding, voice search, spoken document retrieval, and so on. Interfacing speech recognition with its downstream text-based processing tasks of translation, understanding, and information retrieval creates both challenges and opportunities in optimal design of the combined, speech-enabled systems. We present an optimization-oriented statistical framework for the overall system design where the interactions between the sub-systems in tandem are fully incorporated and where design consistency is established between the optimization objectives and the end-to-end system performance metrics. Techniques for optimizing such objectives in both the decoding and learning phases of the speech-centric information processing system design are described, in which the uncertainty in speech recognition sub-system’s outputs is fully considered and marginalized. This paper provides an overview of the past and current work in this area. Future challenges and new opportunities are also discussed and analyzed.
|
|
Keyword:
Speech-Centric Information Processing; Joint Optimization A; Speech Recognition; Spoken Language Translation; Spoken Language Understanding; Voice Search
|
|
URL: http://research.microsoft.com/pubs/179540/ProcIEEE_He_deng_finalsub.pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.303.4845
|
|
BASE
|
|
|
|
2 |
É vida, olha…: Imperatives as Discourse Markers and Grammaticalization Paths in Romance
|
|
|
|
In: http://hal.inria.fr/docs/00/63/72/32/PDF/Fagard_Languages_in_Contrast-Revised_version.pdf (2011)
|
|
BASE
|
|
|
|
3 |
Semantic understanding by combining extended CFG parser with HMM model
|
|
|
|
In: http://groups.csail.mit.edu/sls/publications/2010/Xu_SLT_2010.pdf (2010)
|
|
BASE
|
|
|
|
4 |
Speech to sign language translation system for Spanish
|
|
|
|
In: http://lorien.die.upm.es/~lfdharo/Papers/Speech2SignLanguage_SpeechCom2008.pdf (2008)
|
|
BASE
|
|
|
|
5 |
Sentence Segmentation and Punctuation Recovery for Spoken Language Translation
|
|
|
|
In: http://csl.ira.uka.de/fileadmin/media/publication_files/PaulikSchultz-ICASSP2008.pdf (2008)
|
|
BASE
|
|
|
|
6 |
Statistical and computational models of the visual world paradigm: Growth curves and individual differences
|
|
|
|
In: http://magnuson.psy.uconn.edu/pdfs/MirmanDixonMagnusonJML2008.pdf (2008)
|
|
BASE
|
|
|
|
7 |
Automatic Decision Detection in Meeting Speech
|
|
|
|
In: http://www.iccs.informatics.ed.ac.uk/~jmoore/papers/mlmi07.pdf (2007)
|
|
BASE
|
|
|
|
8 |
Conceptual decoding from word lattices: application to the spoken dialogue corpus MEDIA
|
|
|
|
In: http://lia.univ-avignon.fr/fileadmin/documents/Users/Intranet/chercheurs/bechet/publifred/FB_2006_INTERSPEECH_1.pdf (2006)
|
|
BASE
|
|
|
|
9 |
Conceptual decoding from word lattices: application to the spoken dialogue corpus MEDIA
|
|
|
|
In: http://www-lium.univ-lemans.fr/%7Eservan/publications/Servan_Interspeech2006.pdf (2006)
|
|
BASE
|
|
|
|
10 |
Conceptual decoding from word lattices: application to the spoken dialogue corpus MEDIA
|
|
|
|
In: http://www.ist-luna.eu/pdf/IS061416.pdf (2006)
|
|
BASE
|
|
|
|
11 |
Confidence Estimation for NLP Applications
|
|
|
|
In: http://iit-iti.nrc-cnrc.gc.ca/iit-publications-iti/docs/NRC-48755.pdf (2006)
|
|
BASE
|
|
|
|
12 |
The neural mechanisms of speech comprehension: fMRI studies of semantic ambiguity
|
|
|
|
In: http://www.mrc-cbu.cam.ac.uk/personal/matt.davis/pubs/rodd_davis_johnsrude.cerebral_cortex2005.pdf (2005)
|
|
BASE
|
|
|
|
13 |
The neural mechanisms of speech comprehension: fMRI studies of semantic ambiguity
|
|
|
|
In: http://cercor.oxfordjournals.org/content/15/8/1261.full.pdf (2005)
|
|
BASE
|
|
|
|
14 |
The neural mechanisms of speech comprehension: fMRI studies of semantic ambiguity
|
|
|
|
In: http://cercor.oxfordjournals.org/content/early/2005/01/05/cercor.bhi009.full.pdf (2005)
|
|
BASE
|
|
|
|
15 |
Discriminative and maximum likelihood classifiers for computer-based visual feedback for speech training for the hearing impaired
|
|
|
|
In: http://www.ws.binghamton.edu/zahorian/pdf/Discriminative and Maximum Likelihood Classifiers for Computer-Based Visual Feedback for Speech Training.pdf (2000)
|
|
BASE
|
|
|
|
16 |
Detecting Acoustic Morphemes in Lattices for Spoken Language Understanding
|
|
|
|
In: http://www.research.att.com/~algor/hmihy/papers/gorin684.ps (2000)
|
|
BASE
|
|
|
|
17 |
Speaking in shorthand -- A syllable-centric perspective for understanding . . .
|
|
|
|
In: http://www.icsi.berkeley.edu/~steveng/PDF/SpeakingInShorthandMIME.pdf (1999)
|
|
BASE
|
|
|
|
18 |
Dealing With Multilinguality In A Spoken Language Query Translator
|
|
|
|
In: http://www.ee.ust.hk/~pascale/slt97camera.ps (1997)
|
|
BASE
|
|
|
|
19 |
A Survey on Chinese Speech Recognition
|
|
|
|
In: ftp://ftp.iscs.nus.sg/pub/commcolips/p96001.ps (1996)
|
|
BASE
|
|
|
|
20 |
WHAT MAKES SPEECH STICK?
|
|
|
|
In: http://www.icphs2007.de/conference/Papers/1760/1760.pdf (2007)
|
|
BASE
|
|
|
|
|
|