
A framework for automatic semantic video annotation: Utilizing similarity and commonsense knowledge bases



Abstract

The rapidly increasing quantity of publicly available videos has driven research into developing automatic tools for indexing, rating, searching and retrieval. Textual semantic representations, such as tagging, labelling and annotation, are often important factors in the process of indexing any video, because they represent the semantics appropriate for search and retrieval in a user-friendly way. Ideally, this annotation should be inspired by the human cognitive way of perceiving and describing videos. The difference between the low-level visual contents and the corresponding human perception is referred to as the ‘semantic gap’. Tackling this gap is even harder in the case of unconstrained videos, mainly because of the lack of any prior information about the analyzed video on the one hand, and the huge amount of generic knowledge required on the other. This paper introduces a framework for the Automatic Semantic Annotation of unconstrained videos. The proposed framework utilizes two non-domain-specific layers: low-level visual similarity matching, and an annotation analysis that employs commonsense knowledge bases. A commonsense ontology is created by incorporating multiple structured semantic relationships. Experiments and black-box tests are carried out on standard video databases for action recognition and video information retrieval. White-box tests examine the performance of the framework's individual intermediate layers, and the evaluation of the results together with the statistical analysis shows that integrating visual similarity matching with commonsense semantic relationships provides an effective approach to automated video annotation.
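
To make the two-layer design concrete, below is a minimal, hypothetical Python sketch of how such a pipeline could be wired together: a visual-similarity layer proposes candidate tags from an annotated corpus, and a commonsense layer re-scores them. The corpus, the toy relations, and all names (CORPUS, COMMONSENSE, annotate) are illustrative assumptions, not the paper's actual implementation or data.

```python
# Hypothetical sketch of the two-layer annotation pipeline described in the
# abstract: (1) low-level visual similarity matching against an annotated
# corpus, (2) re-scoring candidate labels with commonsense relationships.
# All data structures, names, and weights here are illustrative.

import math
from collections import defaultdict

# Layer 1 corpus: video id -> (low-level feature vector, known annotations).
CORPUS = {
    "vid_01": ([0.9, 0.1, 0.3], ["running", "park"]),
    "vid_02": ([0.8, 0.2, 0.4], ["jogging", "outdoor"]),
    "vid_03": ([0.1, 0.9, 0.7], ["cooking", "kitchen"]),
}

# Layer 2: a toy commonsense knowledge base as weighted term relations,
# standing in for the multi-relationship ontology the paper builds.
COMMONSENSE = {
    ("running", "jogging"): 0.9,   # e.g. a SimilarTo-style relation
    ("park", "outdoor"): 0.8,      # e.g. an AtLocation-style relation
}

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return dot / (na * nb) if na and nb else 0.0

def relatedness(t1, t2):
    return COMMONSENSE.get((t1, t2)) or COMMONSENSE.get((t2, t1)) or 0.0

def annotate(query_features, k=2):
    # Layer 1: rank corpus videos by visual similarity, keep the top k.
    ranked = sorted(CORPUS.items(),
                    key=lambda kv: cosine(query_features, kv[1][0]),
                    reverse=True)[:k]
    # Collect candidate tags weighted by their source video's similarity.
    scores = defaultdict(float)
    for _, (feats, tags) in ranked:
        sim = cosine(query_features, feats)
        for tag in tags:
            scores[tag] += sim
    # Layer 2: boost tags that the knowledge base links to other candidates,
    # so mutually consistent annotations rise to the top.
    boosted = {t: s + sum(relatedness(t, u) for u in scores if u != t)
               for t, s in scores.items()}
    return sorted(boosted.items(), key=lambda kv: kv[1], reverse=True)

print(annotate([0.85, 0.15, 0.35]))
```

The design point this sketch illustrates is that the commonsense layer does not generate annotations on its own; it only re-ranks candidates proposed by the visual layer, favouring sets of tags that the knowledge base links to one another.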

Bibliographic record

  • Authors

    Altadmri, Amjad; Ahmed, Amr

  • Affiliation
  • Year: 2014
  • Total pages
  • Format: PDF
  • Language: English
  • CLC classification

