Poster

Online Generic Event Boundary Detection

Hyung Rok Jung · Daneul Kim · Seunggyun Lim · Jeany Son · Jonghyun Choi


Abstract: Generic Event Boundary Detection (GEBD) aims to interpret long-form videos through the lens of human perception. However, current GEBD methods rely on complete video frames for prediction, which contrasts with the human ability to process information online and in real time. To bridge this gap, we introduce a new task, Online Generic Event Boundary Detection (On-GEBD), which aims to detect boundaries of generic events immediately in streaming videos. This task poses the unique challenge of identifying subtle, taxonomy-free event changes in real time, without access to future frames. To tackle this challenge, we propose a novel On-GEBD framework, $\textit{ESTimator}$, inspired by Event Segmentation Theory (EST), which explains how humans segment ongoing activity into events by leveraging the discrepancies between predicted and actual information. Our framework consists of two key components: the Consistent Event Anticipator (CEA) and the Online Boundary Discriminator (OBD). Specifically, the CEA generates a prediction of the future frame that reflects current event dynamics, based solely on prior frames. The OBD then computes the discrepancy between the prediction and the actual incoming frame, adaptively adjusting the error threshold using statistical tests on historical errors to capture diverse and subtle event transitions. Experimental results demonstrate that $\textit{ESTimator}$ outperforms all baselines adapted from recent online video understanding models and achieves performance comparable to prior offline GEBD methods on the Kinetics-GEBD and TAPOS datasets.
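The predict-then-compare loop described in the abstract can be illustrated with a toy sketch. This is not the authors' implementation. The abstract does not specify the predictor or the statistical test, so the sketch assumes a running-mean predictor over per-frame feature vectors as a stand-in for the CEA, and a simple z-score test over a sliding window of past errors as a stand-in for the OBD. The class name `OnlineBoundarySketch` and the parameters `window`, `min_history`, and `z_threshold` are all hypothetical.

```python
import math
from collections import deque
from statistics import mean, pstdev


def l2_distance(a, b):
    """Euclidean distance between two equal-length feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


class OnlineBoundarySketch:
    """Toy predict-then-compare boundary detector (all names hypothetical)."""

    def __init__(self, window=8, min_history=4, z_threshold=3.0, eps=1e-6):
        self.min_history = min_history      # errors needed before testing
        self.z_threshold = z_threshold      # assumed z-score cutoff
        self.eps = eps                      # floor for the error spread
        self.frames = deque(maxlen=window)  # recent frames of current event
        self.errors = deque(maxlen=window)  # recent prediction errors

    def predict(self):
        # Assumed stand-in for the anticipator: mean of recent frames.
        n, dim = len(self.frames), len(self.frames[0])
        return [sum(f[i] for f in self.frames) / n for i in range(dim)]

    def step(self, frame):
        """Consume one frame; return True if it starts a new event."""
        is_boundary = False
        if self.frames:
            error = l2_distance(self.predict(), frame)
            if len(self.errors) >= self.min_history:
                # Adaptive threshold derived from historical errors only.
                mu = mean(self.errors)
                sigma = max(pstdev(self.errors), self.eps)
                is_boundary = error > mu + self.z_threshold * sigma
            if is_boundary:
                # Start a fresh history for the new event.
                self.frames.clear()
                self.errors.clear()
            else:
                self.errors.append(error)
        self.frames.append(frame)
        return is_boundary


# Synthetic stream: ten frames near (0, 0), then ten frames near (5, 5).
stream = [[0.01 * (i % 2), 0.0] for i in range(10)]
stream += [[5.0 + 0.01 * (i % 2), 5.0] for i in range(10)]

detector = OnlineBoundarySketch()
boundaries = [i for i, frame in enumerate(stream) if detector.step(frame)]
print(boundaries)  # [10]
```

Because the threshold is computed from the errors seen so far, the small jitter within each segment never triggers a detection, while the jump at frame 10 does. Each decision uses only past frames, which is the online constraint the task imposes.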
