1 | Breathing and Speech Planning in Spontaneous Speech Synthesis

BASE

2 | Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows
Kucherenko, Taras; Henter, Gustav Eje; Beskow, Jonas. KTH, Speech, Music and Hearing (TMH); KTH, Robotics, Perception and Learning (RPL); Wiley, 2020

3 | Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition
Stefanov, Kalin; Beskow, Jonas; Salvi, Giampiero. KTH, Speech, Music and Hearing (TMH); Institute for Creative Technologies, University of Southern California, Los Angeles, CA 90089, United States; NTNU Norwegian University of Science and Technology, Trondheim, Norway; Institute of Electrical and Electronics Engineers (IEEE), 2020

4 | The speech synthesis phoneticians need is both realistic and controllable ...

6 | PROMIS: a statistical-parametric speech synthesis system with prominence control via a prominence network
Malisz, Zofia; Berthelsen, Harald; Beskow, Jonas. KTH, Speech, Music and Hearing (TMH); KTH, Speech Communication; STTS – Södermalms talteknologiservice AB; Vienna, 2019

7 | Modern speech synthesis for phonetic sciences: A discussion and an evaluation

8 | Off the cuff: Exploring extemporaneous speech delivery with TTS

9 | The speech synthesis phoneticians need is both realistic and controllable
Malisz, Zofia; Henter, Gustav Eje; Valentini-Botinhao, Cassia. KTH, Speech, Music and Hearing (TMH); KTH, Speech Communication; The Centre for Speech Technology, The University of Edinburgh, UK; Stockholm, 2019

11 | A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
Kontogiorgos, Dimosthenis; Avramova, Vanya; Alexanderson, Simon; Jonell, Patrik; Oertel, Catharine; Beskow, Jonas; Skantze, Gabriel; Gustafson, Joakim. KTH, Speech, Music and Hearing (TMH); Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland; Paris, 2018

Abstract:
In this paper we present a corpus of multiparty situated interaction in which participants collaborated on moving virtual objects on a large touch screen. A moderator facilitated the discussion and directed the interaction. The corpus contains recordings of a variety of multimodal data: speech, eye gaze and gesture were captured using a multisensory setup (wearable eye trackers, motion capture and audio/video). In the description of the corpus, we investigate four types of social gaze (referential gaze, joint attention, mutual gaze and gaze aversion) from the perspectives of both speaker and listener. We annotated the groups' object references during object manipulation tasks and analysed each group's proportional referential eye gaze with regard to the referent object. When investigating the distributions of gaze during and before referring expressions, we corroborated the differences in timing between speakers' and listeners' eye gaze found in earlier studies. This corpus is of particular interest to researchers studying social eye-gaze patterns in turn-taking and referring language in situated multiparty interaction.

Keyword:
Engineering and Technology

URL: http://urn.kb.se/resolve?urn=urn:nbn:se:kth:diva-230238

12 | The proceedings of the 14th International Conference on Auditory-Visual Speech Processing
In: The 14th International Conference on Auditory-Visual Speech Processing (AVSP2017), Aug 2017, Stockholm, Sweden. https://hal.inria.fr/hal-01596625 ; http://avsp2017.loria.fr (2017)

13 | Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition ...

14 | Using deep neural networks to estimate tongue movements from speech face motion

16 | Tutoring Robots
In: IFIP Advances in Information and Communication Technology; 9th International Summer Workshop on Multimodal Interfaces (eNTERFACE), Jul 2013, Lisbon, Portugal. pp. 80-113, ⟨10.1007/978-3-642-55143-7_4⟩. https://hal.inria.fr/hal-01350740 (2013)

18 | Visual Recognition of Isolated Swedish Sign Language Signs ...