Tutorial
From Segment Anything to Generalized Visual Grounding
Andrew Westbury · Shoubhik Debnath · Weiyao Wang · Laura Gustafson · Daniel Bolya · Xitong Yang · Kate Saenko · Chaitanya Ryali · Haitham Khedr · Christoph Feichtenhofer
In this tutorial, Meta AI and its academic partners will survey frontier research on visual grounding. We will cover each building block needed to move toward future general-purpose visual grounding systems, including universal image and video encoding, multimodal language understanding, semantic instance segmentation and tracking, and the latest 3D reconstruction methods. We will provide practical guidance on using SAM open source models, resources, and tooling to tackle the field's biggest open research problems. A new suite of SAM systems to be released this year will provide a foundation for our tutorial, offering practical entry points for each course component.