Title: MIDGET: Music Conditioned 3D Dance Generation

Abstract: In this paper, we introduce MIDGET, a MusIc conditioned 3D Dance GEneraTion model built on a dance motion Vector Quantised Variational AutoEncoder (VQ-VAE) and a Motion Generative Pre-Training (GPT) model, to generate vibrant, high-quality dances that match the music rhythm. To tackle challenges in the field, we introduce three new components: 1) a pre-trained memory codebook, based on the Motion VQ-VAE model, that stores different human pose codes; 2) a Motion GPT model that generates pose codes using music and motion encoders; 3) a simple framework for music feature extraction. We compare MIDGET with existing state-of-the-art models and perform ablation experiments on AIST++, the largest publicly available music-dance dataset. Experiments demonstrate that the proposed framework achieves state-of-the-art performance on motion quality and on alignment with the music.
Comments: 12 pages, 6 figures. Published in AI 2023: Advances in Artificial Intelligence
Subjects: Sound (cs.SD); Computer Vision and Pattern Recognition (cs.CV); Graphics (cs.GR); Audio and Speech Processing (eess.AS)
Journal reference: In Australasian Joint Conference on Artificial Intelligence (pp. 277-288). Singapore: Springer Nature Singapore, 2023
DOI: 10.1007/978-981-99-8388-9_23
Cite as: arXiv:2404.12062 [cs.SD]
  (or arXiv:2404.12062v1 [cs.SD] for this version)
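
The abstract describes a codebook that stores pose codes, with a VQ-VAE mapping motion features to entries of that codebook. As a rough illustration of that general idea only, the sketch below shows nearest-neighbour vector quantisation in plain Python. It is not the authors' implementation: the function names (`quantize`, `dequantize`), the two-dimensional toy vectors, and the three-entry codebook are all assumptions made for the example, and a real model would learn the codebook and operate on much higher-dimensional encoder outputs.

```python
from typing import List, Sequence


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def quantize(features: Sequence[Sequence[float]],
             codebook: Sequence[Sequence[float]]) -> List[int]:
    """Map each feature vector to the index of its nearest codebook entry.

    Hypothetical helper for illustration; ties resolve to the lowest index.
    """
    indices = []
    for feature in features:
        distances = [squared_distance(feature, code) for code in codebook]
        indices.append(distances.index(min(distances)))
    return indices


def dequantize(indices: Sequence[int],
               codebook: Sequence[Sequence[float]]) -> List[List[float]]:
    """Look up the codebook vectors for a sequence of code indices."""
    return [list(codebook[i]) for i in indices]


# Toy codebook with three "pose codes" (assumed values, not from the paper).
codebook = [[0.0, 0.0], [1.0, 1.0], [2.0, 0.0]]

# Toy per-frame features, standing in for encoder outputs.
features = [[0.1, -0.1], [0.9, 1.2], [1.8, 0.1]]

codes = quantize(features, codebook)
reconstructed = dequantize(codes, codebook)
print(codes)          # discrete code indices, one per frame
print(reconstructed)  # codebook vectors that replace the original features
```

In this kind of setup, a sequence model can be trained over the discrete indices rather than over raw continuous poses, which is the role the abstract assigns to the Motion GPT model.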

Submission history

From: Jinwu Wang
[v1] Thu, 18 Apr 2024 10:20:37 GMT (4762kb,D)
