2 | An error correction scheme for improved air-tissue boundary in real-time MRI video for speech production ...
3 | Expression-preserving face frontalization improves visually assisted speech processing ...
5 | Self-Supervised Representation Learning for Speech Using Visual Grounding and Masked Language Modeling ...
7 | Facetron: Multi-speaker Face-to-Speech Model based on Cross-modal Latent Representations ...
8 | Silent Speech and Emotion Recognition from Vocal Tract Shape Dynamics in Real-Time MRI ...
9 | Improving Ultrasound Tongue Image Reconstruction from Lip Images Using Self-supervised Learning and Attention Mechanism ...
10 | Cascaded Multilingual Audio-Visual Learning from Videos ...
11 | Speaker embeddings by modeling channel-wise correlations ...
12 | Can phones, syllables, and words emerge as side-products of cross-situational audiovisual learning? -- A computational investigation ...
15 | Cross-modal Speaker Verification and Recognition: A Multilingual Perspective ...
16 | Designing, Playing, and Performing with a Vision-based Mouth Interface ...
17 | SLNSpeech: solving extended speech separation problem by the help of sign language ...
18 | Unsupervised Audiovisual Synthesis via Exemplar Autoencoders ...
19 | UniVL: A Unified Video and Language Pre-Training Model for Multimodal Understanding and Generation ...
20 | Disentangled Speech Embeddings using Cross-modal Self-supervision ...