Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication

Sun, Mingze; Xu, Chao; Jiang, Xinyu; Liu, Yang; Sun, Baigui; Huang, Ruqi

Full-text links:

Download:

Current browse context:

cs.CV

< prev | next >

new | recent | 2403

Change to browse by:

Computer Science > Computer Vision and Pattern Recognition

Title: Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication

Authors: Mingze Sun, Chao Xu, Xinyu Jiang, Yang Liu, Baigui Sun, Ruqi Huang

(Submitted on 28 Mar 2024)

Abstract: In this paper, we introduce an innovative task focused on human communication, aiming to generate 3D holistic human motions for both speakers and listeners. Central to our approach is the incorporation of factorization to decouple audio features and the combination of textual semantic information, thereby facilitating the creation of more realistic and coordinated movements. We separately train VQ-VAEs with respect to the holistic motions of both speaker and listener. We consider the real-time mutual influence between the speaker and the listener and propose a novel chain-like transformer-based auto-regressive model specifically designed to characterize real-world communication scenarios effectively which can generate the motions of both the speaker and the listener simultaneously. These designs ensure that the results we generate are both coordinated and diverse. Our approach demonstrates state-of-the-art performance on two benchmark datasets. Furthermore, we introduce the HoCo holistic communication dataset, which is a valuable resource for future research. Our HoCo dataset and code will be released for research purposes upon acceptance.

Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.19467 [cs.CV]
	(or arXiv:2403.19467v1 [cs.CV] for this version)

Submission history

From: Mingze Sun [view email]
[v1] Thu, 28 Mar 2024 14:47:32 GMT (17490kb,D)

Which authors of this paper are endorsers? | Disable MathJax (What is MathJax?)

Link back to: arXiv, form interface, contact.

> cs > arXiv:2403.19467

Download:

Current browse context:

Change to browse by:

References & Citations

DBLP - CS Bibliography

Bookmark

Computer Science > Computer Vision and Pattern Recognition

Title: Beyond Talking -- Generating Holistic 3D Human Dyadic Motion for Communication

Submission history