Home > Foreign-language journals > IEEE Transactions on Image Processing > Robust Spatiotemporal Matching of Electronic Slides to Presentation Videos

Robust Spatiotemporal Matching of Electronic Slides to Presentation Videos



Abstract

We describe a robust and efficient method for automatically matching and time-aligning electronic slides to videos of the corresponding presentations. Matching electronic slides to videos provides new methods for indexing, searching, and browsing videos in distance-learning applications. However, robust automatic matching is challenging due to varied frame composition, slide distortion, camera movement, low-quality video capture, and arbitrary slide sequencing. Our fully automatic approach combines image-based matching of slides to video frames with a temporal model for slide changes and camera events. To address these challenges, we begin by extracting scale-invariant feature transform (SIFT) keypoints from both slides and video frames, and matching them subject to a consistent projective transformation (homography) by using random sample consensus (RANSAC). We use the initial set of matches to construct a background model and a binary classifier for separating video frames showing slides from those without. We then introduce a new matching scheme that exploits less distinctive SIFT keypoints, enabling us to tackle more difficult images. Finally, we improve upon the matching based on visual information by using estimated matching probabilities as part of a hidden Markov model (HMM) that integrates temporal information and detected camera operations. Detailed quantitative experiments characterize each part of our approach and demonstrate an average accuracy of over 95% on 13 presentation videos.
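The core geometric step described in the abstract, keeping only those keypoint matches consistent with a single projective transformation by using RANSAC, can be sketched in pure NumPy. This is an illustrative reimplementation under stated assumptions, not the paper's actual code: the function names are invented, point correspondences stand in for SIFT matches, and the homography is estimated with a plain direct linear transform (DLT).

```python
import numpy as np

def fit_homography(src, dst):
    """Estimate a 3x3 homography H mapping src -> dst via DLT.

    Each correspondence (x, y) -> (u, v) contributes two rows of the
    homogeneous system A h = 0; h is the null vector from the SVD.
    """
    A = []
    for (x, y), (u, v) in zip(src, dst):
        A.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        A.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    _, _, Vt = np.linalg.svd(np.asarray(A))
    H = Vt[-1].reshape(3, 3)
    return H / H[2, 2]  # normalize so H[2,2] == 1

def project(H, pts):
    """Apply homography H to an (N, 2) array of points."""
    p = np.hstack([pts, np.ones((len(pts), 1))]) @ H.T
    return p[:, :2] / p[:, 2:3]

def ransac_homography(src, dst, iters=500, thresh=3.0, rng=None):
    """RANSAC: repeatedly fit H to 4 random matches, keep the H with
    the most inliers (reprojection error below thresh), then refit on
    all inliers. Mirrors the consistency filter described above."""
    rng = rng or np.random.default_rng(0)
    best_inliers = np.zeros(len(src), dtype=bool)
    for _ in range(iters):
        idx = rng.choice(len(src), 4, replace=False)
        H = fit_homography(src[idx], dst[idx])
        err = np.linalg.norm(project(H, src) - dst, axis=1)
        inliers = err < thresh
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Final least-squares refit on the best inlier set.
    return fit_homography(src[best_inliers], dst[best_inliers]), best_inliers
```

In the paper's setting, `src`/`dst` would be the coordinates of putatively matched SIFT keypoints in a slide image and a video frame; a high inlier count signals that the frame actually shows that slide. In practice OpenCV's `cv2.findHomography(..., cv2.RANSAC)` performs this step.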

