A Survey on Backbones for Deep Video Action Recognition

Tang, Zixuan; Zhao, Youjun; Wen, Yuhang; Liu, Mengyuan

(Submitted on 9 May 2024)

Abstract: Action recognition is a key technology in building interactive metaverses. With the rapid development of deep learning, action recognition methods have advanced considerably. Researchers design and implement backbones from multiple standpoints, which leads to a diversity of methods and to new challenges. This paper reviews several action recognition methods based on deep neural networks. We introduce these methods in three parts: 1) two-stream networks and their variants, which, specifically in this paper, take RGB video frames and optical flow as input; 2) 3D convolutional networks, which exploit the RGB modality directly, so that separately extracting motion information is no longer necessary; 3) Transformer-based methods, which bring a model from natural language processing into computer vision and video understanding. We offer objective insights in this review and hope to provide a reference for future research.

Comments:	This paper has been accepted by ICME workshop
Subjects:	Computer Vision and Pattern Recognition (cs.CV); Artificial Intelligence (cs.AI)
Cite as:	arXiv:2405.05584 [cs.CV]
	(or arXiv:2405.05584v1 [cs.CV] for this version)

Submission history

From: Zixuan Tang
[v1] Thu, 9 May 2024 07:20:36 GMT (13395kb,D)
