
Search in the Catalogues and Directories

Hits 1 – 20 of 4,228

1
The Impact of Removing Head Movements on Audio-visual Speech Enhancement
In: ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing ; https://hal.inria.fr/hal-03551610 ; ICASSP 2022 - IEEE International Conference on Acoustics, Speech and Signal Processing, IEEE Signal Processing Society, May 2022, Singapore, Singapore. pp.1-5 (2022)
BASE
2
An Overview of Indian Spoken Language Recognition from Machine Learning Perspective
In: ISSN: 2375-4699 ; EISSN: 2375-4702 ; ACM Transactions on Asian and Low-Resource Language Information Processing ; https://hal.inria.fr/hal-03616853 ; ACM Transactions on Asian and Low-Resource Language Information Processing, ACM, In press, ⟨10.1145/3523179⟩ (2022)
BASE
3
BBC-Oxford British Sign Language Dataset
In: https://hal.archives-ouvertes.fr/hal-03516444 ; 2022 (2022)
BASE
4
Can machines learn to see without visual databases?
In: https://hal.archives-ouvertes.fr/hal-03526569 ; 2022 (2022)
BASE
5
Large-scale Bilingual Language-Image Contrastive Learning ...
Ko, Byungsoo; Gu, Geonmo. - : arXiv, 2022
BASE
6
Bridging Video-text Retrieval with Multiple Choice Questions ...
Abstract: Pre-training a model to learn transferable video-text representation for retrieval has attracted a lot of attention in recent years. Previous dominant works mainly adopt two separate encoders for efficient retrieval, but ignore local associations between videos and texts. Another line of research uses a joint encoder to interact video with texts, but results in low efficiency since each text-video pair needs to be fed into the model. In this work, we enable fine-grained video-text interactions while maintaining high efficiency for retrieval via a novel pretext task, dubbed as Multiple Choice Questions (MCQ), where a parametric module BridgeFormer is trained to answer the "questions" constructed by the text features via resorting to the video features. Specifically, we exploit the rich semantics of text (i.e., nouns and verbs) to build questions, with which the video encoder can be trained to capture more regional content and temporal dynamics. In the form of questions and answers, the semantic associations ... : Accepted by CVPR 2022 ...
Keyword: Computer Vision and Pattern Recognition cs.CV; FOS Computer and information sciences
URL: https://arxiv.org/abs/2201.04850
https://dx.doi.org/10.48550/arxiv.2201.04850
BASE
7
DanFEVER: claim verification dataset for Danish ...
Nørregaard, Jeppe; Derczynski, Leon. - : figshare, 2022
BASE
8
DanFEVER: claim verification dataset for Danish ...
Nørregaard, Jeppe; Derczynski, Leon. - : figshare, 2022
BASE
9
Towards a Perceptual Model for Estimating the Quality of Visual Speech ...
BASE
10
An error correction scheme for improved air-tissue boundary in real-time MRI video for speech production ...
BASE
11
Expression-preserving face frontalization improves visually assisted speech processing ...
BASE
12
WLASL-LEX: a Dataset for Recognising Phonological Properties in American Sign Language ...
BASE
13
Modeling Intensification for Sign Language Generation: A Computational Approach ...
BASE
14
Keypoint based Sign Language Translation without Glosses ...
Kim, Youngmin; Kwak, Minji; Lee, Dain. - : arXiv, 2022
BASE
15
A Transformer-Based Contrastive Learning Approach for Few-Shot Sign Language Recognition ...
BASE
16
A Simple Multi-Modality Transfer Learning Baseline for Sign Language Translation ...
Chen, Yutong; Wei, Fangyun; Sun, Xiao. - : arXiv, 2022
BASE
17
Including Facial Expressions in Contextual Embeddings for Sign Language Generation ...
BASE
18
Signing at Scale: Learning to Co-Articulate Signs for Large-Scale Photo-Realistic Sign Language Production ...
BASE
19
Statistical and Spatio-temporal Hand Gesture Features for Sign Language Recognition using the Leap Motion Sensor ...
Bird, Jordan J.. - : arXiv, 2022
BASE
20
Multi-View Spatial-Temporal Network for Continuous Sign Language Recognition ...
Li, Ronghui; Meng, Lu. - : arXiv, 2022
BASE


© 2013 - 2024 Lin|gu|is|tik