

Poster

From One to More: Contextual Part Latents for 3D Generation

Shaocong Dong · Lihe Ding · Xiao Chen · Yaokun Li · Yuxin WANG · Yucheng Wang · Qi WANG · Jaehyeok Kim · Chenjian Gao · Zhanpeng Huang · Zibin Wang · Tianfan Xue · Dan Xu


Abstract:

To generate 3D objects, early research focused on multi-view-driven approaches that rely solely on 2D renderings. Recently, the 3D-native latent diffusion paradigm has demonstrated superior performance in 3D generation, because it fully leverages the geometric information provided by ground-truth 3D data. Despite this rapid progress, 3D diffusion still faces three challenges. First, most of these methods represent a 3D object with a single latent, regardless of its complexity, which can lead to detail loss when generating objects composed of multiple complicated parts. Second, most 3D assets are designed part by part, yet the current holistic latent representation overlooks both the independence of these parts and their interrelationships, limiting the model's generative ability. Third, current methods rely on global conditions (e.g., text, image, point cloud) to control the generation process and thus lack fine-grained controllability. Motivated by how 3D designers create objects, we therefore present CoPart, a new part-based 3D generation framework that represents a 3D object with multiple contextual part latents and simultaneously generates coherent 3D parts. This part-based framework has several advantages: i) it reduces the encoding burden of intricate objects by decomposing them into simpler parts, ii) it facilitates part learning and part-relationship modeling, and iii) it naturally supports part-level control. Furthermore, to ensure the coherence of part latents and to harness the powerful priors of foundation models, we propose a novel mutual guidance strategy that fine-tunes pre-trained diffusion models for joint part-latent denoising. Benefiting from the part-based representation, we demonstrate that CoPart supports various applications, including part editing, articulated object generation, and mini-scene generation.
Moreover, we collect Partverse, a new large-scale 3D part dataset built from Objaverse through automatic mesh segmentation and subsequent human annotation. Trained on this dataset, CoPart achieves promising part-based 3D generation with high controllability.
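The abstract does not give implementation details, but the core idea of jointly denoising several part latents that exchange information can be illustrated with a toy sketch. Everything below is a hypothetical illustration, not the authors' code: `cross_part_attention` stands in for the mutual guidance exchange between parts, and `joint_denoise_step` mixes each part's own denoising direction with context aggregated from all parts.

```python
import numpy as np

def cross_part_attention(latents):
    # latents: (num_parts, dim). Each part latent attends to every part
    # latent (including itself) via scaled dot-product attention -- a
    # stand-in for the mutual guidance between part latents.
    scores = latents @ latents.T / np.sqrt(latents.shape[1])
    weights = np.exp(scores - scores.max(axis=1, keepdims=True))
    weights /= weights.sum(axis=1, keepdims=True)
    return weights @ latents

def joint_denoise_step(latents, noise_pred, step_size=0.1):
    # One joint update: subtract each part's predicted noise, then nudge
    # each part latent toward the cross-part context so parts stay coherent.
    context = cross_part_attention(latents)
    return latents - step_size * noise_pred + step_size * (context - latents)

rng = np.random.default_rng(0)
parts = rng.normal(size=(4, 8))   # 4 part latents, 8 dims each (toy sizes)
noise = rng.normal(size=(4, 8))   # placeholder for a network's noise output
out = joint_denoise_step(parts, noise)
print(out.shape)                  # one coherent update for all 4 parts
```

In a real diffusion model the placeholder noise would come from a fine-tuned denoiser and the attention would live inside its layers; the sketch only shows why denoising the parts jointly, rather than one latent per object, lets each part condition on its neighbors.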
