Mathematical Problems in Engineering: Theory, Methods and Applications

Egocentric Video Summarization Based on People Interaction Using Deep Learning


Abstract

The availability of wearable cameras in the consumer market has motivated users to record their daily life activities and post them on social media. This exponential growth of egocentric video demands automated techniques that can effectively summarize first-person video data. Egocentric videos are commonly used to record lifelogs these days owing to the availability of low-cost wearable cameras. However, egocentric videos are challenging to process because the camera placement yields footage with a great deal of variation in object appearance, illumination conditions, and movement. This paper presents an egocentric video summarization framework based on detecting important people in the video. The proposed method generates a compact summary of egocentric videos that contains information about the people with whom the camera wearer interacts. Our approach focuses on identifying the interaction of the camera wearer with important people. We use the AlexNet convolutional neural network to filter the key frames (frames where the camera wearer interacts closely with people). The network consists of five convolutional layers, two fully connected hidden layers, and an output layer. Dropout regularization is used to reduce overfitting in the fully connected layers. The performance of the proposed method is evaluated on the UT Ego standard dataset. Experimental results demonstrate the effectiveness of the proposed method in summarizing egocentric videos.
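The abstract specifies five convolutional layers feeding two fully connected hidden layers. The kernel sizes, strides, and 227x227 input resolution below are the standard AlexNet values, assumed here since the abstract does not state them; this sketch only traces how those choices fix the flattened feature size fed to the fully connected layers.

```python
def conv_out(size, kernel, stride=1, pad=0):
    """Spatial output size of a conv or pool layer: floor((n - k + 2p)/s) + 1."""
    return (size - kernel + 2 * pad) // stride + 1

# Standard AlexNet geometry (assumed): five conv layers, three max-pool stages.
s = 227                    # input frame, 227x227 RGB
s = conv_out(s, 11, 4)     # conv1, 96 filters  -> 55x55
s = conv_out(s, 3, 2)      # pool1              -> 27x27
s = conv_out(s, 5, 1, 2)   # conv2, 256 filters -> 27x27
s = conv_out(s, 3, 2)      # pool2              -> 13x13
s = conv_out(s, 3, 1, 1)   # conv3, 384 filters -> 13x13
s = conv_out(s, 3, 1, 1)   # conv4, 384 filters -> 13x13
s = conv_out(s, 3, 1, 1)   # conv5, 256 filters -> 13x13
s = conv_out(s, 3, 2)      # pool5              -> 6x6
flat = 256 * s * s         # flattened input to the first fully connected layer
```

With these values the two fully connected hidden layers receive a 9216-dimensional vector (256 channels x 6 x 6), which the output layer then maps to the key-frame decision.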
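The dropout regularization applied in the fully connected layers can be sketched as follows. This is a minimal plain-Python illustration of inverted dropout, not the authors' implementation; the rate p=0.5 is a common default assumed here.

```python
import random

def dropout(activations, p=0.5, training=True):
    """Inverted dropout: during training, zero each unit with probability p
    and scale survivors by 1/(1-p) so the expected activation is unchanged.
    At inference time the layer is an identity."""
    if not training:
        return list(activations)
    return [0.0 if random.random() < p else a / (1 - p) for a in activations]
```

At inference time `dropout(x, training=False)` returns the input unchanged, so no rescaling is needed at test time; during training each surviving unit is scaled up, which is what discourages co-adaptation of the fully connected units.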

