Poster

From Prompt to Progression: Taming Video Diffusion Models for Seamless Attribute Transition

Ling Lo · Kelvin Chan · Wen-Huang Cheng · Ming-Hsuan Yang


Abstract:

Existing models often struggle with complex temporal changes, particularly when generating videos with gradual attribute transitions. The most common approach, prompt interpolation, was designed for motion transitions and often fails on gradual attribute transitions, where inconsistencies tend to become more pronounced. In contrast, we extend the model to generate smooth and consistent attribute transitions by introducing frame-wise guidance for the video latent during the denoising process. Our approach constructs a data-specific transitional direction for each noisy latent, guiding the gradual shift from initial to final attributes frame by frame while preserving the motion dynamics of the video. Moreover, we present the Controlled-Attribute-Transition Benchmark (CAT-Bench), which integrates both attribute and motion dynamics, to comprehensively evaluate the performance of different models. We further propose two metrics to assess the accuracy and smoothness of attribute transitions. Experimental results demonstrate that our approach performs favorably against existing baselines, achieving visual fidelity, maintaining alignment with text prompts, and delivering seamless attribute transitions.