Title: Action Segmentation Using 2D Skeleton Heatmaps and Multi-Modality Fusion

Abstract: This paper presents a 2D skeleton-based action segmentation method with applications in fine-grained human activity recognition. In contrast with state-of-the-art methods, which directly take sequences of 3D skeleton coordinates as inputs and apply Graph Convolutional Networks (GCNs) for spatiotemporal feature learning, our main idea is to use sequences of 2D skeleton heatmaps as inputs and employ Temporal Convolutional Networks (TCNs) to extract spatiotemporal features. Despite lacking 3D information, our approach yields comparable or superior performance and better robustness against missing keypoints than previous methods on action segmentation datasets. Moreover, we further improve performance by using both 2D skeleton heatmaps and RGB videos as inputs. To the best of our knowledge, this is the first work to utilize 2D skeleton heatmap inputs and the first work to explore 2D skeleton+RGB fusion for action segmentation.
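To make the notion of a 2D skeleton heatmap input concrete, the following is a minimal sketch of one common way to render keypoints as per-joint heatmaps. The abstract does not describe the authors' actual construction, so the Gaussian rendering, the scaling by detection confidence, the `sigma` parameter, and the function name `keypoints_to_heatmaps` are all assumptions made for illustration.

```python
import math


def keypoints_to_heatmaps(keypoints, height, width, sigma=1.0):
    """Render one frame of 2D keypoints as per-joint Gaussian heatmaps.

    Hypothetical helper, not the paper's implementation.
    keypoints: list of (x, y, confidence) tuples in pixel coordinates.
    Returns a list of height x width grids (nested lists), one per joint.
    """
    heatmaps = []
    for x, y, conf in keypoints:
        grid = [[0.0] * width for _ in range(height)]
        if conf > 0.0:  # an undetected joint simply leaves an all-zero map
            for row in range(height):
                for col in range(width):
                    dist_sq = (col - x) ** 2 + (row - y) ** 2
                    grid[row][col] = conf * math.exp(-dist_sq / (2.0 * sigma ** 2))
        heatmaps.append(grid)
    return heatmaps


# One frame with two joints: the second has confidence 0 (a missing keypoint).
frame = [(2.0, 1.0, 1.0), (0.0, 0.0, 0.0)]
maps = keypoints_to_heatmaps(frame, height=4, width=5)
```

Under this assumed encoding, a missing keypoint yields an empty map rather than an undefined coordinate, which is one plausible reason a heatmap input could degrade gracefully when detections drop out. Stacking the per-frame maps over time would give the kind of sequence a temporal network could consume.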
Comments: Accepted to ICRA 2024
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2309.06462 [cs.CV]
  (or arXiv:2309.06462v3 [cs.CV] for this version)

Submission history

From: Quoc-Huy Tran
[v1] Tue, 12 Sep 2023 17:56:06 GMT (657kb,D)
[v2] Tue, 19 Sep 2023 00:01:06 GMT (731kb,D)
[v3] Fri, 26 Apr 2024 02:53:13 GMT (733kb,D)