1 | Breathing and Speech Planning in Spontaneous Speech Synthesis

2 | Style-Controllable Speech-Driven Gesture Synthesis Using Normalising Flows
Kucherenko, Taras; Henter, Gustav Eje; Beskow, Jonas. KTH, Tal, musik och hörsel (TMH); KTH, Robotik, perception och lärande (RPL); Wiley, 2020

3 | Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition
Stefanov, Kalin; Beskow, Jonas; Salvi, Giampiero. KTH, Tal, musik och hörsel (TMH); Institute for Creative Technologies, University of Southern California, Los Angeles, CA 90089, United States; NTNU Norwegian University of Science and Technology, Trondheim, Norway; Institute of Electrical and Electronics Engineers (IEEE), 2020

4 | The speech synthesis phoneticians need is both realistic and controllable ...

6 | PROMIS: a statistical-parametric speech synthesis system with prominence control via a prominence network
Malisz, Zofia; Berthelsen, Harald; Beskow, Jonas. KTH, Tal, musik och hörsel (TMH); KTH, Tal-kommunikation; STTS – Södermalms talteknologiservice AB; Vienna, 2019

7 | Modern speech synthesis for phonetic sciences: A discussion and an evaluation

8 | Off the cuff: Exploring extemporaneous speech delivery with TTS

9 | The speech synthesis phoneticians need is both realistic and controllable
Malisz, Zofia; Henter, Gustav Eje; Valentini-Botinhao, Cassia. KTH, Tal, musik och hörsel (TMH); KTH, Tal-kommunikation; The Centre for Speech Technology, The University of Edinburgh, UK; Stockholm, 2019

11 | A Multimodal Corpus for Mutual Gaze and Joint Attention in Multiparty Situated Interaction
Kontogiorgos, Dimosthenis; Avramova, Vanya; Alexanderson, Simon. KTH, Tal, musik och hörsel (TMH); KTH; Ecole Polytechnique Fédérale de Lausanne (EPFL), Switzerland; Paris, 2018

12 | The proceedings of the 14th International Conference on Auditory-Visual Speech Processing
In: The 14th International Conference on Auditory-Visual Speech Processing (AVSP2017), Aug 2017, Stockholm, Sweden. https://hal.inria.fr/hal-01596625 ; http://avsp2017.loria.fr (2017)

13 | Self-Supervised Vision-Based Detection of the Active Speaker as Support for Socially-Aware Language Acquisition ...

14 | Using deep neural networks to estimate tongue movements from speech face motion

16 | Tutoring Robots
In: IFIP Advances in Information and Communication Technology; 9th International Summer Workshop on Multimodal Interfaces (eNTERFACE), Jul 2013, Lisbon, Portugal, pp. 80–113. https://hal.inria.fr/hal-01350740 ; doi:10.1007/978-3-642-55143-7_4

18 | Visual Recognition of Isolated Swedish Sign Language Signs ...

Abstract:
We present a method for recognition of isolated Swedish Sign Language signs. The method will be used in a game intended to help children practise signing at home, as a complement to training with a teacher. The target group is not primarily deaf children, but children with language disorders. Using sign language as a support in conversation has been shown to greatly stimulate the speech development of such children. The signer is captured with an RGB-D (Kinect) sensor, which has three advantages over a regular RGB camera. Firstly, it allows complex backgrounds to be removed easily: we segment the hands and face based on skin color and depth information. Secondly, it helps resolve hand-over-face occlusion. Thirdly, signs take place in 3D; some aspects of a sign are defined by hand motion perpendicular to the image plane, and this motion can be estimated when depth is observable. The 3D motion of the hands relative to the torso is used as a cue together with the hand shape, and HMMs trained with ...

Keyword:
Computer Vision and Pattern Recognition (cs.CV); FOS: Computer and information sciences

URL: https://arxiv.org/abs/1211.3901 ; https://dx.doi.org/10.48550/arxiv.1211.3901