“Slidin’ Videos”: Slide Transition Detection and Title Extraction in Lecture Videos
YouTube’s “Video Chapter” feature segments a video into sections marked by timestamps so that the user can easily navigate to the part of the video that interests them most. This can be done by clicking or tapping a chapter marker, or by selecting a timestamp in the video description.
This problem statement takes the feature a step further. For webinars in which speakers present slides, participants are asked to build the best AI model for annotating slide transitions by:
- identifying the starting and ending frames of each slide shown in the video
- extracting the (apparent) title of each slide
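A natural baseline for the first task is simple frame differencing: slides are mostly static, so a large jump in pixel difference between consecutive frames suggests a transition. The sketch below assumes frames are already decoded into grayscale NumPy arrays; the threshold value is an illustrative guess, not a tuned constant.

```python
import numpy as np

def detect_transitions(frames, threshold=0.1):
    """Return indices of frames that likely start a new slide.

    A transition is flagged when the mean absolute pixel difference
    between consecutive frames (normalized to [0, 1]) exceeds
    `threshold`. `frames` is an iterable of grayscale uint8 arrays.
    """
    transitions = []
    prev = None
    for i, frame in enumerate(frames):
        if prev is not None:
            # Cast to a signed type so the subtraction cannot wrap around.
            diff = np.abs(frame.astype(np.int16) - prev.astype(np.int16))
            if diff.mean() / 255.0 > threshold:
                transitions.append(i)  # frame i starts a new slide
        prev = frame
    return transitions
```

In practice such a detector would need to be made robust to embedded animations, webcam overlays, and gradual fades, which is where the modeling challenge lies.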
Recordings of 100 “AI for Good” (https://aiforgood.itu.int/) webinars were sourced to assemble a diverse collection of more than 240 video presentations made by members of the scientific community, entrepreneurs, and standardization experts.
We are providing:
- Video files covering each presentation from the moment the speaker started screensharing to the moment it was turned off. The videos vary in duration (from several minutes to several hours) and resolution (from 1600×1200 to 3840×2160).
- A ground-truth data set of around 3,000 slide transitions, giving the starting and ending frame of each slide along with its (apparent) title.
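The exact file format of the ground-truth annotations is not specified here, but a per-slide record of start frame, end frame, and title could be loaded along the following lines. The CSV layout and column names below are purely hypothetical, for illustration only.

```python
import csv
import io

# Hypothetical annotation layout; the real data set may use a
# different format and different field names.
SAMPLE = """start_frame,end_frame,title
0,1499,Welcome
1500,3299,Agenda
"""

def load_annotations(text):
    """Parse annotation rows into (start_frame, end_frame, title) tuples."""
    reader = csv.DictReader(io.StringIO(text))
    return [(int(r["start_frame"]), int(r["end_frame"]), r["title"])
            for r in reader]
```

Having the annotations in this shape makes it straightforward to score a model's predicted transitions against the labeled frame ranges.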
This live event includes a 30-minute networking session hosted on the AI for Good Neural Network. This is your opportunity to ask questions, interact with the panelists and participants, and build connections with the AI for Good community.