ICCV Poster A₀ : An Affordance-Aware Hierarchical Model for General Robotic Manipulation

Rongtao Xu · Jian Zhang · Minghao Guo · Youpeng Wen · Haoting Yang · Min Lin · Jianzheng Huang · Zhe Li · Kaidong Zhang · Liqiong Wang · Yuxuan Kuang · Meng Cao · Feng Zheng · Xiaodan Liang

Abstract:

Robotic manipulation faces critical challenges in understanding spatial affordances (the "where" and "how" of object interactions), which are essential for complex manipulation tasks such as wiping a board or stacking objects. Existing methods, including modular and end-to-end approaches, often lack robust spatial reasoning capabilities. Unlike recent point-based and flow-based affordance methods that focus on dense spatial representations or trajectory modeling, we propose A₀, a hierarchical affordance-aware diffusion model that decomposes manipulation tasks into high-level spatial affordance understanding and low-level action execution. A₀ leverages the Embodiment-Agnostic Affordance Representation, which captures object-centric spatial affordances by predicting contact points and post-contact trajectories. A₀ is pre-trained on a dataset of 1 million contact points and fine-tuned on annotated trajectories, enabling generalization across platforms. Key components include Position Offset Attention for motion-aware feature extraction and a Spatial Information Aggregation Layer for precise coordinate mapping. The model's output is executed by the action execution module. Experiments on multiple robotic systems (Franka, Kinova, Realman, and Dobot) demonstrate A₀'s superior performance in complex tasks, showcasing its efficiency, flexibility, and real-world applicability.
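
The two-level split described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. The `Affordance` class, the `to_pixels` and `execute` functions, the `move_to` callback, and the choice of 2D coordinates normalized to [0, 1] are all assumptions made for illustration; the abstract does not specify the coordinate frame or any interface.

```python
from dataclasses import dataclass
from typing import Callable, List, Tuple

# Assumption: waypoints are 2D image coordinates normalized to [0, 1].
Point = Tuple[float, float]
Pixel = Tuple[int, int]


@dataclass(frozen=True)
class Affordance:
    """Hypothetical container: one contact point plus post-contact waypoints."""
    contact_point: Point
    trajectory: List[Point]

    def waypoints(self) -> List[Point]:
        # Full object-centric path: contact point first, then the motion after contact.
        return [self.contact_point, *self.trajectory]


def to_pixels(aff: Affordance, width: int, height: int) -> List[Pixel]:
    """Map normalized waypoints to integer pixel coordinates, clamped to the image."""
    return [
        (min(int(x * width), width - 1), min(int(y * height), height - 1))
        for x, y in aff.waypoints()
    ]


def execute(aff: Affordance, width: int, height: int,
            move_to: Callable[[Pixel], None]) -> int:
    """Stand-in for the low-level stage: hand each waypoint to a robot-specific callback."""
    pixels = to_pixels(aff, width, height)
    for px in pixels:
        move_to(px)
    return len(pixels)
```

In this sketch the representation carries no robot-specific fields, so the same `Affordance` could be passed to different `move_to` callbacks, one per platform. A real system would also need depth and camera calibration to turn image points into end-effector targets, which is omitted here.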
