Holistic Video Understanding

Action recognition has advanced in recent years thanks to benchmarks with rich annotations. However, research is still mainly limited to human action or sports recognition, a highly specific video understanding task, which leaves a significant gap in describing the overall content of a video. We fill this gap by presenting a large-scale "Holistic Video Understanding Dataset" (HVU). HVU is organized hierarchically in a semantic taxonomy that treats multi-label and multi-task video understanding as a comprehensive problem encompassing the recognition of multiple semantic aspects in a dynamic scene. HVU contains approximately 577k videos in total, with ~13M annotations for the training and validation sets spanning ~3k classes. Its semantic aspects are defined over categories of scenes, objects, actions, events, attributes, and concepts, which naturally capture real-world scenarios.

[CVPR 2022] Third International Workshop on Large Scale Holistic Video Understanding

3:36:24

453 views - 3 years ago

[ICCV 2021] Second International Tutorial on Large Scale Holistic Video Understanding

7:35:45

987 views - 4 years ago