Abstract: Video summarization aims to extract the most important segments from a video through spatiotemporal analysis. Previous methods primarily learn content within videos based on appearance ...