

Poster

WorldScore: Unified Evaluation Benchmark for World Generation

Haoyi Duan · Hong-Xing Yu · Sirui Chen · Li Fei-Fei · Jiajun Wu


Abstract:

We introduce the WorldScore benchmark, the first unified benchmark for world generation. We decompose world generation into a sequence of next-scene generation tasks with explicit camera-trajectory-based layout specifications, enabling unified evaluation of diverse approaches, from 3D and 4D scene generation to video generation models. The WorldScore benchmark encompasses a curated dataset of 3,000 test examples that span diverse worlds: indoor and outdoor, static and dynamic, photorealistic and stylized. The WorldScore metric evaluates generated worlds on three key aspects: controllability, quality, and dynamics. Through extensive evaluation of 19 representative models, including both open-source and closed-source implementations, we reveal key insights and challenges for each category of models. We will open-source WorldScore, including evaluation metrics, datasets, and generated videos.
