Poster
InfiniCube: Unbounded and Controllable Dynamic 3D Driving Scene Generation with World-Guided Video Models
Yifan Lu · Xuanchi Ren · Jiawei Yang · Tianchang Shen · Jay Zhangjie Wu · Jun Gao · Yue Wang · Siheng Chen · Mike Chen · Sanja Fidler · Jiahui Huang
We present InfiniCube, a scalable and controllable method to generate unbounded, dynamic 3D driving scenes with high fidelity. Previous methods for scene generation are constrained either by their restriction to indoor scenes or by their lack of controllability. In contrast, we take advantage of recent advances in 3D and video generative models to achieve large-scale dynamic scene generation with flexible control through HD maps, vehicle bounding boxes, and text descriptions. First, we construct a map-conditioned 3D voxel generative model and leverage it for unbounded voxel world generation. Then, we re-purpose a video model and ground it on the voxel world through a set of pixel-aligned guidance buffers, synthesizing consistent appearance over long video generation for large-scale scenes. Finally, we propose a fast feed-forward approach that employs both voxel and pixel branches to lift videos to dynamic 3D Gaussians with controllable objects. Our method generates controllable and realistic 3D driving scenes, and extensive experiments validate the effectiveness of our model design. Code will be released upon acceptance.
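To make the three-stage pipeline described in the abstract concrete, the sketch below outlines how the stages could be chained: map-conditioned voxel generation, video synthesis grounded on pixel-aligned guidance buffers, and feed-forward lifting to dynamic 3D Gaussians. All function names, signatures, and tensor shapes here are hypothetical placeholders invented for illustration; they do not correspond to a released InfiniCube API or to the authors' implementation.

```python
# Hypothetical sketch of the pipeline described in the abstract.
# Every name and shape below is an assumption for illustration only.
import numpy as np


def generate_voxel_world(hd_map, boxes, text, resolution=256):
    """Stage 1 (sketch): map-conditioned 3D voxel generation.
    A real model would condition on the HD map, vehicle boxes, and text
    to produce a semantic occupancy grid; here we return a dummy grid."""
    rng = np.random.default_rng(0)
    return rng.integers(0, 20, size=(resolution,) * 3, dtype=np.int32)


def render_guidance_buffers(voxels, camera_trajectory, hw=(90, 160)):
    """Stage 2a (sketch): project the voxel world into pixel-aligned
    guidance buffers (e.g. semantics / depth) for each camera pose."""
    h, w = hw
    return [np.zeros((h, w, 2), dtype=np.float32) for _ in camera_trajectory]


def generate_video(guidance_buffers, text):
    """Stage 2b (sketch): a video model grounded on the guidance buffers
    would synthesize consistent appearance; here we return blank frames."""
    return [np.zeros(buf.shape[:2] + (3,), dtype=np.float32)
            for buf in guidance_buffers]


def lift_to_gaussians(voxels, frames, n_gaussians=1_000):
    """Stage 3 (sketch): feed-forward lifting of video frames onto the
    voxel scaffold into dynamic 3D Gaussian parameters."""
    return {"means": np.zeros((n_gaussians, 3)),
            "colors": np.zeros((n_gaussians, 3))}


if __name__ == "__main__":
    trajectory = [np.eye(4) for _ in range(8)]  # 8 placeholder camera poses
    voxels = generate_voxel_world(hd_map=None, boxes=[], text="sunny street")
    buffers = render_guidance_buffers(voxels, trajectory)
    frames = generate_video(buffers, text="sunny street")
    scene = lift_to_gaussians(voxels, frames)
    print(len(frames), scene["means"].shape)
```

The stubs only trace the data flow (map and boxes to voxels, voxels to guidance buffers, buffers to video frames, frames plus voxels to Gaussians); the actual generative models, buffer contents, and Gaussian parameterization are described in the paper, not here.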