Hits 61 – 80 of 17,396

61	Word separation in continuous sign language using isolated signs and post-processing ...
	Rastgoo, Razieh; Kiani, Kourosh; Escalera, Sergio. - : arXiv, 2022
	BASE

62	Exploring Sub-skeleton Trajectories for Interpretable Recognition of Sign Language ...
	Gudmundsson, Joachim; Seybold, Martin P.; Pfeifer, John. - : arXiv, 2022
	BASE

63	ASL-Skeleton3D and ASL-Phono: Two Novel Datasets for the American Sign Language ...
	de Amorim, Cleison Correia; Zanchettin, Cleber. - : arXiv, 2022
	BASE

64	Thai Finger Spelling Recognition: Investigating MediaPipe Hands Potentials ...
	Sanalohit, Jinnavat; Katanyukul, Tatpong. - : arXiv, 2022
	BASE

65	Sign Language Video Retrieval with Free-Form Textual Queries ...
	Duarte, Amanda; Albanie, Samuel; Giró-i-Nieto, Xavier. - : arXiv, 2022
	BASE

66	All You Need In Sign Language Production ...
	Rastgoo, Razieh; Kiani, Kourosh; Escalera, Sergio. - : arXiv, 2022
	BASE

67	Towards Zero-shot Sign Language Recognition ...
	Bilge, Yunus Can; Cinbis, Ramazan Gokberk; Ikizler-Cinbis, Nazli. - : arXiv, 2022
	BASE

68	Sign Language Recognition System using TensorFlow Object Detection API ...
	Srivastava, Sharvani; Gangwar, Amisha; Mishra, Richa. - : arXiv, 2022
	BASE

69	Three-dimensional reconstruction of the human body, hands and face with applications in sign language recognition ...
	Kratimenos, Angelos. - : National Technological University of Athens, 2022
	BASE

70	Biasing Like Human: A Cognitive Bias Framework for Scene Graph Generation ...
	Chang, Xiaoguang; Wang, Teng; Sun, Changyin. - : arXiv, 2022
	BASE

71	hate-alert@DravidianLangTech-ACL2022: Ensembling Multi-Modalities for Tamil TrollMeme Classification ...
	Das, Mithun; Banerjee, Somnath; Mukherjee, Animesh. - : arXiv, 2022
	BASE

72	Wukong: 100 Million Large-scale Chinese Cross-modal Pre-training Dataset and A Foundation Framework ...
	Gu, Jiaxi; Meng, Xiaojun; Lu, Guansong; Hou, Lu; Niu, Minzhe; Liang, Xiaodan; Yao, Lewei; Huang, Runhui; Zhang, Wei; Jiang, Xin; Xu, Chunjing; Xu, Hang. - : arXiv, 2022
	Abstract: Vision-Language Pre-training (VLP) models have shown remarkable performance on various downstream tasks. Their success heavily relies on the scale of pre-trained cross-modal datasets. However, the lack of large-scale datasets and benchmarks in Chinese hinders the development of Chinese VLP models and broader multilingual applications. In this work, we release a large-scale Chinese cross-modal dataset named Wukong, containing 100 million Chinese image-text pairs from the web. Wukong aims to benchmark different multi-modal pre-training methods to facilitate the VLP research and community development. Furthermore, we release a group of models pre-trained with various image encoders (ViT-B/ViT-L/SwinT) and also apply advanced pre-training techniques into VLP such as locked-image text tuning, token-wise similarity in contrastive learning, and reduced-token interaction. Extensive experiments and a deep benchmarking of different downstream tasks are also provided. Experiments show that Wukong can serve as a ...
	Keyword: Computer Vision and Pattern Recognition (cs.CV); FOS Computer and information sciences; Machine Learning (cs.LG)
	URL: https://arxiv.org/abs/2202.06767 https://dx.doi.org/10.48550/arxiv.2202.06767
	BASE

73	SwinTextSpotter: Scene Text Spotting via Better Synergy between Text Detection and Text Recognition ...
	Huang, Mingxin; Liu, Yuliang; Peng, Zhenghao. - : arXiv, 2022
	BASE

74	3MASSIV: Multilingual, Multimodal and Multi-Aspect dataset of Social Media Short Videos ...
	Gupta, Vikram; Mittal, Trisha; Mathur, Puneet. - : arXiv, 2022
	BASE

75	Taking an Emotional Look at Video Paragraph Captioning ...
	Li, Qinyu; Li, Tengpeng; Wang, Hanli. - : arXiv, 2022
	BASE

76	EnvEdit: Environment Editing for Vision-and-Language Navigation ...
	Li, Jialu; Tan, Hao; Bansal, Mohit. - : arXiv, 2022
	BASE

77	IGLUE: A Benchmark for Transfer Learning across Modalities, Tasks, and Languages ...
	Bugliarello, Emanuele; Liu, Fangyu; Pfeiffer, Jonas. - : arXiv, 2022
	BASE

78	Natural Language Descriptions of Deep Visual Features ...
	Hernandez, Evan; Schwettmann, Sarah; Bau, David. - : arXiv, 2022
	BASE

79	Finding Structural Knowledge in Multimodal-BERT ...
	Milewski, Victor; de Lhoneux, Miryam; Moens, Marie-Francine. - : arXiv, 2022
	BASE

80	IterVM: Iterative Vision Modeling Module for Scene Text Recognition ...
	Chu, Xiaojie; Wang, Yongtao. - : arXiv, 2022
	BASE

© 2013 - 2024 Lin|gu|is|tik