
Bridging the semantic gap: Image and video understanding by exploiting attributes.



Abstract

Understanding images and videos is one of the fundamental problems in computer vision. Traditionally, research in this area has focused on extracting low-level features from images and videos and learning classifiers that map these features to pre-defined classes of objects, scenes, or activities. However, it is well known that a "semantic gap" exists between low-level features and high-level semantic concepts, and this gap greatly obstructs progress in image and video understanding.

Our work departs from the traditional view in that we add a middle layer, called attributes, between the high-level concepts and the low-level features, and use this layer to facilitate the description of concepts and the detection of entities in images and videos. On one hand, attributes are relatively simple and can therefore be detected more reliably from low-level features; on the other hand, we can exploit high-level knowledge about the relationships between attributes and high-level concepts, and among the attributes themselves, thereby narrowing the semantic gap. We demonstrate these ideas in three applications.

First, we present an attribute-based learning approach for object recognition, in which attributes are used to transfer knowledge about object properties from known classes to unknown classes, thereby reducing the number of training examples needed to learn new object classes.

Next, we describe an active framework for recognizing scenes based on the objects they contain, which are treated as attributes of the scenes. The active framework exploits the correlations among objects in a scene and thus significantly reduces the number of objects that must be detected in order to recognize the scene.

Finally, we propose a novel approach for detecting activity attributes in sports videos, in which contextual constraints are exploited to reduce ambiguity in attribute detection. These activity attributes enable us to go beyond naming activity categories and achieve a fine-grained description of the activities in the videos.
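To make the attribute-based transfer idea concrete, here is a minimal sketch (not the dissertation's implementation) of direct attribute prediction: per-attribute detectors, trained on known classes, score an image, and an unseen class is chosen by matching the predicted attributes against per-class attribute signatures. The class names, attribute names, and signatures below are illustrative assumptions.

```python
import numpy as np

# Hypothetical class-attribute signatures (one row per class, one entry
# per attribute such as "striped", "four-legged", "has-tail"). Unseen
# classes are described only by these binary signatures, so no training
# images of the new classes are required.
CLASS_ATTRIBUTES = {
    "zebra": np.array([1, 1, 1]),
    "snake": np.array([0, 0, 1]),
    "horse": np.array([0, 1, 1]),
}

def predict_class(attribute_scores):
    """Match per-attribute detector scores (probabilities in [0, 1]) to
    the class whose signature has the highest log-likelihood under
    independent Bernoulli attribute models."""
    eps = 1e-6
    p = np.clip(attribute_scores, eps, 1 - eps)
    best_name, best_ll = None, -np.inf
    for name, sig in CLASS_ATTRIBUTES.items():
        # Log-likelihood of the signature given the attribute scores.
        ll = np.sum(sig * np.log(p) + (1 - sig) * np.log(1 - p))
        if ll > best_ll:
            best_name, best_ll = name, ll
    return best_name

# An image whose detectors fire for "striped", "four-legged", and
# "has-tail" matches the zebra signature best.
print(predict_class(np.array([0.9, 0.8, 0.95])))  # → zebra
```

The point of the middle layer is visible here: the attribute detectors are shared across classes, and adding a new class costs only a new signature row, not new labeled images.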

Bibliographic details

  • Author

    Yu, Xiaodong

  • Affiliation

    University of Maryland, College Park

  • Degree-granting institution: University of Maryland, College Park
  • Subject: Engineering, Electronics and Electrical; Computer Science
  • Degree: Ph.D.
  • Year: 2013
  • Pages: 106 p.
  • Format: PDF
  • Language: English
