A novel, low-level video frame description method is proposed that compactly captures informative image statistics from luminance, color and stereoscopic disparity video data, at both global and various local scales. Scene texture, illumination and geometry properties can thus be succinctly encoded in a single frame feature descriptor, which can then serve as a building block in any key-frame extraction scheme, e.g., shot frame clustering. The computed key-frames are then used to derive a movie summary in the form of a video skim, which is post-processed to reduce the stereoscopic video defects that cause visual fatigue and arise as a by-product of summarization.
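As a rough illustration of the pipeline outlined above, the following minimal sketch builds a multi-scale frame descriptor and selects key-frames by clustering. It is not the proposed method: the choice of plain intensity histograms as the "image statistics", the 2x2 local grid, the use of k-means, and all names (`channel_histogram`, `frame_descriptor`, `select_key_frames`) are assumptions made for illustration only. Each input channel is assumed to be a 2-D array scaled to [0, 1].

```python
import numpy as np

def channel_histogram(channel, bins=8):
    """Normalized histogram of one channel with values in [0, 1]."""
    hist, _ = np.histogram(channel, bins=bins, range=(0.0, 1.0))
    return hist / max(hist.sum(), 1)

def frame_descriptor(luma, hue, disparity, grid=2, bins=8):
    """Concatenate global and local (grid x grid) histograms per channel.

    Histograms stand in for the unspecified image statistics (assumption).
    """
    parts = []
    for channel in (luma, hue, disparity):
        parts.append(channel_histogram(channel, bins))  # global scale
        h, w = channel.shape
        for i in range(grid):  # local scale: one histogram per grid cell
            for j in range(grid):
                block = channel[i * h // grid:(i + 1) * h // grid,
                                j * w // grid:(j + 1) * w // grid]
                parts.append(channel_histogram(block, bins))
    return np.concatenate(parts)

def select_key_frames(descriptors, k, iters=20, seed=0):
    """Cluster shot frames with k-means (assumed clustering scheme) and
    return the index of the frame nearest to each cluster center."""
    rng = np.random.default_rng(seed)
    X = np.asarray(descriptors, dtype=float)
    centers = X[rng.choice(len(X), size=k, replace=False)]
    for _ in range(iters):
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for c in range(k):
            if np.any(labels == c):  # leave empty clusters unchanged
                centers[c] = X[labels == c].mean(axis=0)
    dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
    return sorted(set(int(i) for i in dists.argmin(axis=0)))
```

With three channels, one global and four local regions, and 8 bins, each descriptor has 3 x 5 x 8 = 120 entries. The skim assembly and the post-processing of stereoscopic defects mentioned in the abstract are not covered by this sketch.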