

Poster

LEGO-Maker: A Semantic-Driven Algorithm for Text-to-3D Generation

Yifei Zhang · Lei Chen


Abstract:

Driven by large-scale model iterations, the inference speed and generalization ability of 3D model generation have improved significantly. However, the output quality of existing methods still falls short of direct use without post-processing: common issues include insufficient texture clarity, loss of semantic information, lack of fine-grained detail, and redundant artifacts. Moreover, current approaches produce only static structures whose individual components cannot move, leaving functional applications out of the generation process. To address these limitations, we draw inspiration from LEGO-like modular construction and decompose complex models into semantically functional components. We propose LEGO-Maker, a novel framework that reformulates the text-to-3D task as a three-stage process: target image generation, functional semantic decomposition, and multi-task 3D generation with structured fusion. Leveraging a reorganized high-quality 3D dataset, we train a diffusion model and a semantic segmentation model tailored to 3D generation. We further design a motion-driven mechanism that introduces action sequences for functionally interactive modules after model fusion. Experiments show that, compared with existing methods, our approach significantly improves semantic understanding, model detail quality, and text consistency, while remaining directly applicable across a variety of scenarios.
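The three-stage pipeline described above (target image generation, functional semantic decomposition, and multi-task 3D generation with structured fusion, followed by the motion-driven mechanism) can be sketched as data flow. The code below is a minimal illustrative sketch only: every function name, field, and part label is a hypothetical placeholder, not the authors' actual implementation or API.

```python
# Hypothetical sketch of the LEGO-Maker pipeline stages as described in the
# abstract. All names (functions, dict keys, part labels) are illustrative
# assumptions; the real system uses trained diffusion / segmentation models.

def generate_target_image(prompt):
    # Stage 1: a text-to-image diffusion model renders a reference view.
    return {"prompt": prompt, "image": f"image({prompt})"}

def decompose_semantics(target):
    # Stage 2: a semantic segmentation model splits the target into
    # LEGO-like functional parts (labels here are placeholders).
    return [{"part": p, "source": target["image"]}
            for p in ("body", "movable_module")]

def generate_and_fuse(parts):
    # Stage 3: each part is lifted to 3D independently (multi-task
    # generation), then the part meshes are fused into one structured model.
    meshes = [{"part": p["part"], "mesh": f"mesh({p['part']})"} for p in parts]
    return {"fused": meshes}

def attach_motion(model):
    # Post-fusion: the motion-driven mechanism attaches action sequences to
    # functionally interactive modules only.
    for m in model["fused"]:
        m["animated"] = (m["part"] == "movable_module")
    return model

def lego_maker(prompt):
    # End-to-end composition of the three stages plus motion attachment.
    return attach_motion(
        generate_and_fuse(decompose_semantics(generate_target_image(prompt))))

result = lego_maker("a wooden chair with a reclining back")
print([m["part"] for m in result["fused"]])
```

The key design point the sketch mirrors is that decomposition happens before 3D generation, so each semantically functional component is generated and can later be animated independently, rather than baking the whole object into one static mesh.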
