

Poster

Towards Comprehensive Lecture Slides Understanding: Large-scale Dataset and Effective Method

Yuzhe Li · Yuzhe Li · Yuliang Liu · Yingying Zhu · Xiang Bai


Abstract:

Online education has become widespread in universities and educational institutions worldwide. Lecture slides, a fundamental component of online education, contain a wealth of information and play a crucial role in learning. However, previous work has paid insufficient attention to understanding lecture slides: both a large-scale dataset and comprehensive understanding tasks are lacking. To facilitate research on lecture slide understanding, we establish LecSlides-370K, which consists of 25,542 lectures with 370,078 slides across 15 areas. We also introduce two comprehensive tasks, Lecture Summary and Lecture Question Answering (QA), which provide different perspectives on slide understanding. Furthermore, complex and flexible text relations can hinder understanding of the internal logic of slides. To address this challenge, we propose a novel method, named SlideParser, which includes an auxiliary branch that predicts text relations within slides and enhances attention between related texts, thereby improving slide understanding. Extensive experiments demonstrate the superiority of the proposed method on both LecSlides-370K and SlideVQA. The dataset and code will be released soon.
