

Tutorial

From Segment Anything to Generalized Visual Grounding

Andrew Westbury · Shoubhik Debnath · Weiyao Wang · Laura Gustafson · Daniel Bolya · Xitong Yang · Kate Saenko · Chaitanya Ryali · Haitham Khedr · Christoph Feichtenhofer

Mon 20 Oct, noon–3:30 p.m. PDT

Abstract:

In this tutorial, researchers from Meta AI and its academic partners will survey frontier research on visual grounding. We will cover each building block necessary to move toward future general-purpose visual grounding systems, including universal image and video encoding, multimodal language understanding, semantic instance segmentation and tracking, and the latest 3D reconstruction methods. We will provide practical guidance on using SAM open-source models, resources, and tooling to tackle the field's biggest open research problems. A new suite of SAM systems to be released this year will provide a foundation for the tutorial, offering practical entry points for each course component.
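For readers who want a concrete entry point before the session, below is a minimal sketch of point-prompted mask prediction with the openly released segment-anything package. The checkpoint filename, image path, and click coordinates are placeholders to adapt, not values prescribed by the tutorial.

```python
# Minimal point-prompted segmentation with the segment-anything package
# (pip install segment-anything; checkpoint from the SAM release page).
# The checkpoint filename, image path, and click coordinates below are
# placeholders.
import cv2
import numpy as np
from segment_anything import SamPredictor, sam_model_registry

# Load a SAM backbone and wrap it in the prompt-based predictor.
sam = sam_model_registry["vit_h"](checkpoint="sam_vit_h_4b8939.pth")
predictor = SamPredictor(sam)

# SAM expects an RGB uint8 image; OpenCV loads BGR, so convert.
image = cv2.cvtColor(cv2.imread("example.jpg"), cv2.COLOR_BGR2RGB)
predictor.set_image(image)  # runs the image encoder once

# One foreground click (label 1) as the prompt; multimask_output
# returns three candidate masks with predicted quality scores.
masks, scores, _ = predictor.predict(
    point_coords=np.array([[500, 375]]),
    point_labels=np.array([1]),
    multimask_output=True,
)
best_mask = masks[np.argmax(scores)]  # HxW boolean mask
```

Because the image embedding is computed once in set_image, additional prompts on the same image are cheap, which is what makes prompt-based models like SAM practical for interactive grounding workflows.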
