1	Sanjeev Kumar 1
	David Demirdjian; Alexander Gruenstein; Xiaoguang Li; John Niekrasz; Matt Wesson
	In: http://www-csli.stanford.edu/~alexgru/pubs/icmi04.pdf
	Abstract: We present a video demonstration of an agent-based test bed application for ongoing research into multi-user, multimodal, computer-assisted meetings. The system tracks a two-person scheduling meeting: one person stands at a touch-sensitive whiteboard creating a Gantt chart, while another person looks on in view of a calibrated stereo camera. The stereo camera performs real-time, untethered, vision-based tracking of the onlooker’s head, torso and limb movements, which in turn are routed to a 3D-gesture recognition agent. Using speech, 3D deictic gesture and 2D object de-referencing, the system is able to track the onlooker’s suggestion to move a specific milestone. The system also has a speech recognition agent capable of recognizing out-of-vocabulary (OOV) words as phonetic sequences. Thus, when a user at the whiteboard speaks an OOV label name for a chart constituent while also writing it, the OOV speech is combined with letter sequences hypothesized by the handwriting recognizer to yield an orthography, pronunciation and semantics for the new label. These are then learned dynamically by the system and become immediately available for future recognition.
	Keyword: Design; Experimentation; Human Factors; Multimodal interaction; vision-based body-tracking; vocabulary
	URL: http://www-csli.stanford.edu/~alexgru/pubs/icmi04.pdf http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.141.7911
	BASE
