OAKINK2: A Dataset of Bimanual Hands-Object Manipulation in Complex Task Completion

Zhan, Xinyu; Yang, Lixin; Zhao, Yifei; Mao, Kangrui; Xu, Hanlin; Lin, Zenan; Li, Kailin; Lu, Cewu

(Submitted on 28 Mar 2024)

Abstract: We present OAKINK2, a dataset of bimanual object manipulation tasks for complex daily activities. To organize complex tasks into a structured representation, OAKINK2 introduces three levels of abstraction: Affordance, Primitive Task, and Complex Task. OAKINK2 takes an object-centric perspective for decoding complex tasks, treating them as a sequence of object affordance fulfillments. The first level, Affordance, outlines the functionalities that objects in the scene can afford; the second level, Primitive Task, describes the minimal interaction units through which humans interact with an object to achieve its affordance; and the third level, Complex Task, illustrates how Primitive Tasks are composed and how they depend on one another. The OAKINK2 dataset provides multi-view image streams and precise pose annotations for the human body, hands, and various interacting objects. This extensive collection supports applications such as interaction reconstruction and motion synthesis. Based on the three-level abstraction of OAKINK2, we explore a task-oriented framework for Complex Task Completion (CTC). CTC aims to generate a sequence of bimanual manipulations that achieves the task objectives. Within the CTC framework, we employ Large Language Models (LLMs) to decompose complex task objectives into sequences of Primitive Tasks, and we develop a Motion Fulfillment Model that generates bimanual hand motion for each Primitive Task. OAKINK2 datasets and models are available at this https URL
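
Illustrative sketch (not part of the original abstract): the three-level abstraction can be pictured as a small data model in which a Complex Task is a set of Primitive Tasks with dependencies, each fulfilling one object Affordance. The code below is a minimal sketch under stated assumptions; it is not the OAKINK2 schema or toolkit API. All class names, fields, and the example task are hypothetical, and the topological sort is simply one way to turn interdependent Primitive Tasks into a valid sequence.

```python
from dataclasses import dataclass, field
from graphlib import TopologicalSorter  # standard library, Python 3.9+


@dataclass(frozen=True)
class Affordance:
    """Level 1 (hypothetical fields): a functionality an object can afford."""
    object_name: str
    function: str


@dataclass
class PrimitiveTask:
    """Level 2: a minimal interaction that fulfills one affordance."""
    name: str
    affordance: Affordance
    # Names of Primitive Tasks that must be completed first (assumed encoding).
    depends_on: list[str] = field(default_factory=list)


@dataclass
class ComplexTask:
    """Level 3: Primitive Tasks together with their interdependencies."""
    goal: str
    primitives: list[PrimitiveTask]

    def execution_order(self) -> list[str]:
        # Map each primitive to its prerequisites, then sort topologically
        # so every primitive appears after the ones it depends on.
        graph = {p.name: set(p.depends_on) for p in self.primitives}
        return list(TopologicalSorter(graph).static_order())


# Hypothetical example task; the primitives are listed out of order on purpose.
uncap = PrimitiveTask("uncap_bottle", Affordance("bottle", "open"))
pour = PrimitiveTask("pour_into_cup", Affordance("bottle", "pour"),
                     depends_on=["uncap_bottle"])
recap = PrimitiveTask("recap_bottle", Affordance("bottle", "close"),
                      depends_on=["pour_into_cup"])

task = ComplexTask("serve a cup of water", [recap, pour, uncap])
order = task.execution_order()
print(order)
```

In the CTC framework described in the abstract, the decomposition into Primitive Tasks is produced by an LLM and each Primitive Task is then handed to the Motion Fulfillment Model; the sketch covers only the structural idea of composing dependent primitives into a sequence.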

Comments:	To appear in CVPR 2024. 26 pages
Subjects:	Computer Vision and Pattern Recognition (cs.CV)
Cite as:	arXiv:2403.19417 [cs.CV]
	(or arXiv:2403.19417v1 [cs.CV] for this version)

Submission history

From: Xinyu Zhan
[v1] Thu, 28 Mar 2024 13:47:19 GMT (23468kb,D)
