
Search in the Catalogues and Directories

Hits 1 – 20 of 30

1
Towards a Perceptual Model for Estimating the Quality of Visual Speech ...
BASE
2
An error correction scheme for improved air-tissue boundary in real-time MRI video for speech production ...
BASE
3
Expression-preserving face frontalization improves visually assisted speech processing ...
Abstract: Face frontalization consists of synthesizing a frontally-viewed face from an arbitrarily-viewed one. The main contribution of this paper is a frontalization methodology that preserves non-rigid facial deformations in order to boost the performance of visually assisted speech communication. The method alternates between the estimation of (i) the rigid transformation (scale, rotation, and translation) and (ii) the non-rigid deformation between an arbitrarily-viewed face and a face model. The method has two important merits: it can deal with non-Gaussian errors in the data and it incorporates a dynamical face deformation model. For that purpose, we use the generalized Student t-distribution in combination with a linear dynamic system in order to account for both rigid head motions and time-varying facial deformations caused by speech production. We propose to use the zero-mean normalized cross-correlation (ZNCC) score to evaluate the ability of the method to preserve facial expressions. The method is thoroughly ...
Keyword: Audio and Speech Processing eess.AS; Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences; FOS Electrical engineering, electronic engineering, information engineering; Sound cs.SD
URL: https://dx.doi.org/10.48550/arxiv.2204.02810
https://arxiv.org/abs/2204.02810
BASE
4
Freeform Body Motion Generation from Speech ...
Xu, Jing; Zhang, Wei; Bai, Yalong. - : arXiv, 2022
BASE
5
Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling ...
Peng, Puyuan; Harwath, David. - : arXiv, 2022
BASE
6
Speaker Extraction with Co-Speech Gestures Cue ...
Pan, Zexu; Qian, Xinyuan; Li, Haizhou. - : arXiv, 2022
BASE
7
Facetron: Multi-speaker Face-to-Speech Model based on Cross-modal Latent Representations ...
Um, Se-Yun; Kim, Jihyun; Lee, Jihyun. - : arXiv, 2021
BASE
8
Silent Speech and Emotion Recognition from Vocal Tract Shape Dynamics in Real-Time MRI ...
Pandey, Laxmi; Arif, Ahmed Sabbir. - : arXiv, 2021
BASE
9
Improving Ultrasound Tongue Image Reconstruction from Lip Images Using Self-supervised Learning and Attention Mechanism ...
Liu, Haiyang; Zhang, Jihan. - : arXiv, 2021
BASE
10
Cascaded Multilingual Audio-Visual Learning from Videos ...
BASE
11
Speaker embeddings by modeling channel-wise correlations ...
BASE
12
Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? -- A computational investigation ...
Khorrami, Khazar; Räsänen, Okko. - : arXiv, 2021
BASE
13
AudioViewer: Learning to Visualize Sounds ...
BASE
14
Large-scale multilingual audio visual dubbing ...
BASE
15
Cross-modal Speaker Verification and Recognition: A Multilingual Perspective ...
BASE
16
Designing, Playing, and Performing with a Vision-based Mouth Interface ...
BASE
17
SLNSpeech: solving extended speech separation problem by the help of sign language ...
Wu, Jiasong; Li, Taotao; Kong, Youyong. - : arXiv, 2020
BASE
18
Unsupervised Audiovisual Synthesis via Exemplar Autoencoders ...
BASE
19
UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation ...
Luo, Huaishao; Ji, Lei; Shi, Botian. - : arXiv, 2020
BASE
20
Disentangled Speech Embeddings using Cross-modal Self-supervision ...
BASE

Hits by source type
Catalogues: 0
Bibliographies: 0
Linked Open Data catalogues: 0
Online resources: 0
Open access documents: 30
© 2013 - 2024 Lin|gu|is|tik