Following video copy detection, temporal alignment of the frames of a copied video with the master content is essential in numerous forensic applications, such as computing geometric distortions and estimating the pirate's location in a theater during illegal camcorder capture. State-of-the-art temporal video copy registration methods exploit only the visual features of videos and make no use of audio signatures. Furthermore, existing studies focus primarily on aligning watermarked videos, and very few address non-watermarked videos. To address these issues, this paper presents a robust temporal registration scheme based on visual-audio fingerprints. The scheme consists of two stages: first, the video sequence is compactly represented using 1-D motion and acoustic profiles; second, accurate frame-to-frame matches are computed using a sliding-window-based dynamic programming technique. Experiments on the TRECVID-2008 and 2009 datasets demonstrate the efficiency and effectiveness of the proposed framework compared with the reference methods under a wide range of video editing operations and transformations.
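The second stage can be illustrated with a minimal sketch. The abstract does not specify the cost function or the window rule, so the code below is an assumption: it uses a standard banded dynamic time warping (a Sakoe-Chiba style band standing in for the sliding window) over two 1-D profiles, with absolute difference as the per-frame cost. The function name `align_profiles` and the `window` parameter are hypothetical and are not taken from the paper.

```python
from math import inf


def align_profiles(master, copy, window):
    """Banded dynamic-programming alignment of two 1-D profiles.

    Assumption: absolute difference as local cost and a fixed-width band
    around the diagonal; the paper's actual formulation may differ.
    Returns (master_index, copy_index) pairs on the cheapest path.
    """
    n, m = len(master), len(copy)
    # The band must cover the length difference, otherwise the
    # end cell (n, m) cannot be reached.
    window = max(window, abs(n - m))
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        lo = max(1, i - window)
        hi = min(m, i + window)
        for j in range(lo, hi + 1):
            d = abs(master[i - 1] - copy[j - 1])
            cost[i][j] = d + min(
                cost[i - 1][j - 1],  # one-to-one frame match
                cost[i - 1][j],      # frame dropped in the copy
                cost[i][j - 1],      # frame repeated in the copy
            )
    # Backtrack from the end cell to recover frame-to-frame matches.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        steps = [
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        ]
        _, i, j = min(steps)
    path.reverse()
    return path
```

Calling `align_profiles(master_profile, copy_profile, window=2)` on two per-frame profiles returns the matched index pairs. In this sketch, a motion profile and an acoustic profile could each be aligned this way, or combined into a single per-frame value beforehand; how the paper fuses the two is not stated in the abstract.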