1 |
The Impact of Removing Head Movements on Audio-visual Speech Enhancement
|
|
|
|
In: ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing ; https://hal.inria.fr/hal-03551610 ; ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE Signal Processing Society, May 2022, Singapore, Singapore. pp.1-5 (2022)
|
|
BASE
|
|
Show details
|
|
2 |
An Overview of Indian Spoken Language Recognition from Machine Learning Perspective
|
|
|
|
In: ISSN: 2375-4699 ; EISSN: 2375-4702 ; ACM Transactions on Asian and Low-Resource Language Information Processing ; https://hal.inria.fr/hal-03616853 ; ACM Transactions on Asian and Low-Resource Language Information Processing, ACM, In press, ⟨10.1145/3523179⟩ (2022)
|
|
BASE
|
|
Show details
|
|
3 |
BBC-Oxford British Sign Language Dataset
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03516444 ; 2022 (2022)
|
|
BASE
|
|
Show details
|
|
4 |
Can machines learn to see without visual databases?
|
|
|
|
In: https://hal.archives-ouvertes.fr/hal-03526569 ; 2022 (2022)
|
|
BASE
|
|
Show details
|
|
5 |
Large-scale Bilingual Language-Image Contrastive Learning ...
|
|
|
|
BASE
|
|
Show details
|
|
6 |
Bridging Video-text Retrieval with Multiple Choice Questions ...
|
|
|
|
BASE
|
|
Show details
|
|
9 |
Towards a Perceptual Model for Estimating the Quality of Visual Speech ...
|
|
|
|
BASE
|
|
Show details
|
|
10 |
An error correction scheme for improved air-tissue boundary in real-time MRI video for speech production ...
|
|
|
|
BASE
|
|
Show details
|
|
11 |
Expression-preserving face frontalization improves visually assisted speech processing ...
|
|
|
|
BASE
|
|
Show details
|
|
12 |
WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language ...
|
|
|
|
BASE
|
|
Show details
|
|
13 |
Modeling Intensification for Sign Language Generation: A Computational Approach ...
|
|
|
|
Abstract:
End-to-end sign language generation models do not accurately represent the prosody in sign language. A lack of temporal and spatial variations leads to poor-quality generated presentations that confuse human interpreters. In this paper, we aim to improve the prosody in generated sign languages by modeling intensification in a data-driven manner. We present different strategies grounded in linguistics of sign language that inform how intensity modifiers can be represented in gloss annotations. To employ our strategies, we first annotate a subset of the benchmark PHOENIX-14T, a German Sign Language dataset, with different levels of intensification. We then use a supervised intensity tagger to extend the annotated dataset and obtain labels for the remaining portion of it. This enhanced dataset is then used to train state-of-the-art transformer models for sign language generation. We find that our efforts in intensification modeling yield better results when evaluated with automatic metrics. Human evaluation ... : 15 pages, Findings of the Association for Computational Linguistics: ACL 2022 ...
|
|
Keyword:
Artificial Intelligence cs.AI; Computation and Language cs.CL; Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences
|
|
URL: https://dx.doi.org/10.48550/arxiv.2203.09679 https://arxiv.org/abs/2203.09679
|
|
BASE
|
|
Hide details
|
|
14 |
Keypoint based Sign Language Translation without Glosses ...
|
|
|
|
BASE
|
|
Show details
|
|
15 |
A Transformer-Based Contrastive Learning Approach for Few-Shot Sign Language Recognition ...
|
|
|
|
BASE
|
|
Show details
|
|
16 |
A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation ...
|
|
|
|
BASE
|
|
Show details
|
|
17 |
Including Facial Expressions in Contextual Embeddings for Sign Language Generation ...
|
|
|
|
BASE
|
|
Show details
|
|
18 |
Signing at Scale: Learning to Co-Articulate Signs for Large-Scale Photo-Realistic Sign Language Production ...
|
|
|
|
BASE
|
|
Show details
|
|
19 |
Statistical and Spatio-temporal Hand Gesture Features for Sign Language Recognition using the Leap Motion Sensor ...
|
|
|
|
BASE
|
|
Show details
|
|
20 |
Multi-View Spatial-Temporal Network for Continuous Sign Language Recognition ...
|
|
|
|
BASE
|
|
Show details
|
|
|
|