Workshop 306 B

5th Workshop and Challenge on Open-World 3D Scene Understanding

Francis Engelmann, Ayca Takmaz, Alex Delitzas, Elisabetta Fedele, Anna-Maria Halacheva, Katerina Adam, Yang Miao, Jan-Nico Zaech, Zuria Bauer, Johanna Wald, Danda Pani Paudel, Or Litany, Federico Tombari, Marc Pollefeys, Leonidas Guibas

Project Page

Abstract

The ability to perceive, understand, and interact with 3D scenes is crucial for applications in AR/VR, robotics, healthcare, and beyond. Current 3D scene understanding models are largely limited to low-level recognition tasks such as object detection or semantic segmentation, and struggle to generalize beyond predefined training labels. Recently, large vision-language models (VLMs) such as LLaVA have demonstrated impressive capabilities. Initial works have shown their potential to extend 3D scene understanding not only to open-vocabulary recognition, but also to reasoning about affordances, activities, and properties of unseen environments. This workshop aims to define tasks, metrics, and benchmarks to advance this emerging direction.
