Title: BAMM: Bidirectional Autoregressive Motion Model

Abstract: Generating human motion from text has been dominated by denoising motion models based on either diffusion or generative masking. However, these models are limited in usability because they require prior knowledge of the motion length. Autoregressive motion models, in contrast, address this limitation by adaptively predicting motion endpoints, at the cost of degraded generation quality and editing capabilities. To address these challenges, we propose the Bidirectional Autoregressive Motion Model (BAMM), a novel text-to-motion generation framework. BAMM consists of two key components: (1) a motion tokenizer that transforms 3D human motion into discrete tokens in latent space, and (2) a masked self-attention transformer that autoregressively predicts randomly masked tokens via a hybrid attention masking strategy. By unifying generative masked modeling and autoregressive modeling, BAMM captures rich, bidirectional dependencies among motion tokens while learning the probabilistic mapping from textual inputs to motion outputs with dynamically adjusted motion sequence length. This enables BAMM to simultaneously achieve high-quality motion generation, enhanced usability, and built-in motion editability. Extensive experiments on the HumanML3D and KIT-ML datasets demonstrate that BAMM surpasses current state-of-the-art methods in both qualitative and quantitative measures. Our project page is available at this https URL
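The abstract does not spell out the hybrid attention masking strategy, so the following is only a rough sketch of one plausible reading: during training, each sequence of motion tokens has a random subset replaced by a mask token, and the attention mask is chosen at random to be either causal (autoregressive) or fully bidirectional. The function names, the `p_causal` and `ratio` parameters, and the per-sample coin flip are all assumptions made for illustration and are not taken from the BAMM implementation.

```python
import random

def causal_mask(n):
    """allowed[i][j] is True when position i may attend to position j (j <= i)."""
    return [[j <= i for j in range(n)] for i in range(n)]

def bidirectional_mask(n):
    """Every position may attend to every other position."""
    return [[True] * n for _ in range(n)]

def hybrid_mask(n, p_causal, rng):
    """Hypothetical hybrid rule: causal with probability p_causal, else bidirectional."""
    if rng.random() < p_causal:
        return causal_mask(n)
    return bidirectional_mask(n)

def mask_tokens(tokens, ratio, mask_id, rng):
    """Replace a random subset of tokens (at least one) with mask_id.

    Returns the corrupted sequence and the sorted masked positions,
    which are the positions a model would be trained to predict.
    """
    n_mask = max(1, round(ratio * len(tokens)))
    positions = set(rng.sample(range(len(tokens)), n_mask))
    corrupted = [mask_id if i in positions else t for i, t in enumerate(tokens)]
    return corrupted, sorted(positions)

rng = random.Random(0)
tokens = [11, 42, 7, 93, 5, 18]   # made-up discrete motion token ids
MASK_ID = -1                      # hypothetical mask token id
corrupted, targets = mask_tokens(tokens, ratio=0.5, mask_id=MASK_ID, rng=rng)
attention = hybrid_mask(len(tokens), p_causal=0.5, rng=rng)
```

Under this reading, the causal branch is what would let a model predict an end-of-motion token and hence the sequence length, while the bidirectional branch is what would support infilling for editing.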
Subjects: Computer Vision and Pattern Recognition (cs.CV)
Cite as: arXiv:2403.19435 [cs.CV]
  (or arXiv:2403.19435v3 [cs.CV] for this version)

Submission history

From: Ekkasit Pinyoanuntapong
[v1] Thu, 28 Mar 2024 14:04:17 GMT (32457kb,D)
[v2] Fri, 29 Mar 2024 11:15:04 GMT (32457kb,D)
[v3] Mon, 1 Apr 2024 13:02:20 GMT (32457kb,D)
