Computer Science > Computer Vision and Pattern Recognition

Title: Bottom-Up 2D Pose Estimation via Dual Anatomical Centers for Small-Scale Persons

Abstract: In multi-person 2D pose estimation, bottom-up methods simultaneously predict poses for all persons and, unlike top-down methods, do not rely on human detection. However, the accuracy of SOTA bottom-up methods is still inferior to that of existing top-down methods. This is because the predicted human poses are regressed from an inconsistent human bounding box center and lack human-scale normalization, leading to inaccurate poses and missed small-scale persons. To push the envelope of bottom-up pose estimation, we first propose multi-scale training to enhance the network's ability to handle scale variation with single-scale testing, particularly for small-scale persons. Second, we introduce dual anatomical centers (i.e., head and body), from which human poses can be predicted more accurately and reliably, especially for small-scale persons. Moreover, existing bottom-up methods use multi-scale testing to boost pose estimation accuracy at the price of multiple additional forward passes, which weakens the efficiency of bottom-up methods, their core strength over top-down methods. By contrast, our multi-scale training enables the model to predict high-quality poses in a single forward pass (i.e., single-scale testing). Our method achieves a 38.4% improvement in bounding box precision and a 39.1% improvement in bounding box recall over the state of the art (SOTA) on the challenging small-scale persons subset of COCO. For human pose AP evaluation, we achieve a new SOTA (71.0 AP) on the COCO test-dev set with single-scale testing. We also achieve the top performance (40.3 AP) on the OCHuman dataset in cross-dataset evaluation.
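The dual-anatomical-center idea above can be illustrated with a minimal sketch: each person gets two candidate poses, one regressed from the head center and one from the body center, and the two are fused per keypoint. The fusion rule shown here (keep the keypoint from the more confident center) and all function names are illustrative assumptions, not the paper's actual grouping/fusion algorithm.

```python
import numpy as np

def fuse_dual_center_poses(pose_head, conf_head, pose_body, conf_body):
    """Fuse two candidate poses for one person.

    Per keypoint, keep the prediction regressed from whichever anatomical
    center (head or body) is more confident. Hypothetical fusion rule for
    illustration only; the paper's exact scheme may differ.
    """
    pose_head = np.asarray(pose_head, dtype=float)   # (K, 2) keypoints from head center
    pose_body = np.asarray(pose_body, dtype=float)   # (K, 2) keypoints from body center
    conf_head = np.asarray(conf_head, dtype=float)   # (K,) per-keypoint confidences
    conf_body = np.asarray(conf_body, dtype=float)

    use_head = conf_head >= conf_body                # boolean mask, one entry per keypoint
    fused = np.where(use_head[:, None], pose_head, pose_body)
    fused_conf = np.maximum(conf_head, conf_body)
    return fused, fused_conf

# Toy example with 3 keypoints: keypoint 0 is trusted from the head center,
# keypoint 1 from the body center, keypoint 2 ties (head wins by >=).
ph = [[10, 10], [20, 20], [30, 30]]
pb = [[11, 11], [19, 21], [33, 29]]
fused, conf = fuse_dual_center_poses(ph, [0.9, 0.2, 0.5], pb, [0.4, 0.8, 0.5])
```

A per-keypoint rule like this is one plausible reason two centers help small-scale persons: when one center's regression is unreliable (e.g., the body center for a heavily occluded person), the other can still anchor the pose.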
Comments: 28 pages, 10 figures, and 6 tables
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2208.11975 [cs.CV]
  (or arXiv:2208.11975v2 [cs.CV] for this version)

Submission history

From: Yihao Ai [view email]
[v1] Thu, 25 Aug 2022 10:09:10 GMT (32769kb,D)
[v2] Wed, 23 Nov 2022 05:03:37 GMT (32769kb,D)
