

Timezone: Pacific/Honolulu
 

SAT 18 OCT
1 p.m.

SUN 19 OCT
7 a.m.
Break:
(ends 9:00 AM)
8 a.m.
Workshop:
(ends 12:30 PM)
Workshop:
(ends 12:30 PM)
Workshop:
(ends 12:00 PM)
8:15 a.m.
8:55 a.m.
9:15 a.m.
Workshop:
(ends 5:30 PM)
10 a.m.
Break:
(ends 11:00 AM)
noon
Break:
(ends 1:45 PM)
1 p.m.
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:00 PM)
3 p.m.
Break:
(ends 4:00 PM)

MON 20 OCT
7 a.m.
Break:
(ends 9:00 AM)
8 a.m.
Workshop:
(ends 12:30 PM)
Workshop:
(ends 12:00 PM)
Workshop:
(ends 12:00 PM)
Workshop:
(ends 12:00 PM)
Workshop:
(ends 5:00 PM)
Workshop:
(ends 5:30 PM)
Tutorial:
(ends 12:00 PM)
8:10 a.m.
Workshop:
(ends 12:10 PM)
8:45 a.m.
Workshop:
(ends 5:00 PM)
10 a.m.
Break:
(ends 11:00 AM)
noon
Break:
(ends 1:45 PM)
3 p.m.
Break:
(ends 4:00 PM)

TUE 21 OCT
7 a.m.
Break:
(ends 9:00 AM)
(ends 5:00 PM)
8:45 a.m.
Orals 9:00-10:15
[9:00] GT-Loc: Unifying When and Where in Images Through a Joint Embedding Space
[9:15] Scaling Laws for Native Multimodal Models
[9:30] FixTalk: Taming Identity Leakage for High-Quality Talking Head Generation in Extreme Cases
[9:45] Differentiable Room Acoustic Rendering with Multi-View Vision Priors
[10:00] Token Activation Map to Visually Explain Multimodal LLMs
(ends 10:00 AM)
Orals 9:00-10:15
[9:00] Multi-View 3D Point Tracking
[9:15] Uncalibrated Structure from Motion on a Sphere
[9:30] Removing Cost Volumes from Optical Flow Estimators
[9:45] Image as an IMU: Estimating Camera Motion from a Single Motion-Blurred Image
[10:00] TrajectoryCrafter: Redirecting Camera Trajectory for Monocular Videos via Diffusion Models
(ends 10:00 AM)
10 a.m.
Break:
(ends 11:00 AM)
10:15 a.m.
Keynote:
Sheperd Doeleman
(ends 11:15 AM)
11:30 a.m.
Demonstration:
(ends 1:30 PM)
Break:
(ends 1:30 PM)
11:45 a.m.
Posters 11:45-1:45
(ends 1:45 PM)
1:30 p.m.
Orals 1:45-3:00
[1:45] Variance-Based Pruning for Accelerating and Compressing Trained Networks
[2:00] Importance-Based Token Merging for Efficient Image and Video Generation
[2:15] Knowledge Distillation for Learned Image Compression
[2:30] Distilling Diffusion Models to Efficient 3D LiDAR Scene Completion
[2:45] Heavy Labels Out! Dataset Distillation with Label Space Lightening
(ends 2:30 PM)
Orals 1:45-3:00
[1:45] RayZer: A Self-supervised Large View Synthesis Model
[2:00] EVER: Exact Volumetric Ellipsoid Rendering for Real-time View Synthesis
[2:15] Self-Ensembling Gaussian Splatting for Few-Shot Novel View Synthesis
[2:30] Back on Track: Bundle Adjustment for Dynamic Scene Reconstruction
[2:45] SceneSplat: Gaussian Splatting-based Scene Understanding with Vision-Language Pretraining
(ends 2:45 PM)
3 p.m.
Demonstration:
(ends 5:00 PM)
Posters 3:15-5:15
(ends 5:00 PM)

WED 22 OCT
7:30 a.m.
Break:
(ends 9:00 AM)
(ends 5:00 PM)
8 a.m.
Orals 8:00-9:30
[8:00] RS-vHeat: Heat Conduction Guided Efficient Remote Sensing Foundation Model
[8:15] Towards a Unified Copernicus Foundation Model for Earth Vision
[8:30] Learning Streaming Video Representation via Multitask Training
[8:45] LoftUp: Learning a Coordinate-Based Feature Upsampler for Vision Foundation Models
[9:00] Learning Visual Hierarchies in Hyperbolic Space for Image Retrieval
[9:15] GMMamba: Group Masking Mamba for Whole Slide Image Classification
(ends 9:15 AM)
Orals 8:00-9:30
[8:00] NullSwap: Proactive Identity Cloaking Against Deepfake Face Swapping
[8:15] MaskControl: Spatio-Temporal Control for Masked Motion Synthesis
[8:30] HairCUP: Hair Compositional Universal Prior for 3D Gaussian Avatars
[8:45] Understanding Co-speech Gestures in-the-wild
[9:00] DPoser-X: Diffusion Model as Robust 3D Whole-body Human Pose Prior
[9:15] Teeth Reconstruction and Performance Capture Using a Phone Camera
(ends 9:15 AM)
9:15 a.m.
Break:
(ends 10:15 AM)
9:30 a.m.
Keynote:
Brent Seales
(ends 10:30 AM)
10:45 a.m.
Demonstration:
(ends 12:45 PM)
Posters 10:45-1:15
(ends 12:45 PM)
11 a.m.
Break:
(ends 1:00 PM)
1 p.m.
Orals 1:15-2:30
[1:15] Forecasting Continuous Non-Conservative Dynamical Systems in SO(3)
[1:30] Certifiably Optimal Anisotropic Rotation Averaging
[1:45] Deterministic Object Pose Confidence Region Estimation
[2:00] RePoseD: Efficient Relative Pose Estimation With Known Depth Information
[2:15] Diving into the Fusion of Monocular Priors for Generalized Stereo Matching
(ends 2:15 PM)
Orals 1:15-2:30
[1:15] Dynamic Typography: Bringing Text to Life via Video Diffusion Prior
[1:30] Generating Physically Stable and Buildable Brick Structures from Text
[1:45] WIR3D: Visually-Informed and Geometry-Aware 3D Shape Abstraction
[2:00] SparseFlex: High-Resolution and Arbitrary-Topology 3D Shape Modeling
[2:15] ReCamMaster: Camera-Controlled Generative Rendering from A Single Video
(ends 2:15 PM)
2:30 p.m.
Demonstration:
(ends 4:30 PM)
Posters 2:45-4:45
(ends 4:00 PM)
4:45 p.m.
Meeting:
(ends 5:45 PM)
6:30 p.m.
Reception:
(ends 8:00 PM)

THU 23 OCT
7:30 a.m.
Break:
(ends 9:00 AM)
(ends 2:00 PM)
8 a.m.
Orals 8:00-9:30
[8:00] ROAR: Reducing Inversion Error in Generative Image Watermarking
[8:15] Moto: Latent Motion Token as the Bridging Language for Learning Robot Manipulation from Videos
[8:30] Automated Model Evaluation for Object Detection via Prediction Consistency and Reliability
[8:45] Counting Stacked Objects
[9:00] MIORe & VAR-MIORe: Benchmarks to Push the Boundaries of Restoration
[9:15] Soft Local Completeness: Rethinking Completeness in XAI
(ends 9:15 AM)
Orals 8:00-9:30
[8:00] LaRender: Training-Free Occlusion Control in Image Generation via Latent Rendering
[8:15] MikuDance: Animating Character Art with Mixed Motion Dynamics
[8:30] Diffusion Transformer meets Multi-level Wavelet Spectrum for Single Image Super-Resolution
[8:45] LOTS of Fashion! Multi-Conditioning for Image Generation via Sketch-Text Pairing
[9:00] FlowEdit: Inversion-Free Text-Based Editing Using Pre-Trained Flow Models
[9:15] LayerTracer: Cognitive-Aligned Layered SVG Synthesis via Diffusion Transformer
(ends 9:15 AM)
10:45 a.m.
Posters 11:15-1:15
(ends 12:45 PM)
Demonstration:
(ends 12:45 PM)
11 a.m.
Break:
(ends 1:00 PM)
1 p.m.
Orals 1:15-2:30
[1:15] CorrCLIP: Reconstructing Patch Correlations in CLIP for Open-Vocabulary Semantic Segmentation
[1:30] E-SAM: Training-Free Segment Every Entity Model
[1:45] Online Reasoning Video Segmentation with Just-in-Time Digital Twins
[2:00] Easy3D: A Simple Yet Effective Method for 3D Interactive Segmentation
[2:15] ForestFormer3D: A Unified Framework for End-to-End Segmentation of Forest LiDAR 3D Point Clouds
(ends 2:15 PM)
Orals 1:15-2:30
[1:15] SuperDec: 3D Scene Decomposition with Superquadrics Primitives
[1:30] Diffusion Image Prior
[1:45] Spatially-Varying Autofocus
[2:00] Towards Foundational Models for Single-Chip Radar
[2:15] Event-based Visual Vibrometry
(ends 2:15 PM)
2:30 p.m.
Posters 2:30-4:45
(ends 4:30 PM)
Demonstration:
(ends 4:30 PM)