International Conference on Pattern Recognition

Building a Multi-Modal Thesaurus from Annotated Images


Abstract

We propose an unsupervised approach to learn associations between low-level visual features and keywords. We assume that a collection of images is available and that each image is globally annotated. The objective is to extract representative visual profiles that correspond to frequent homogeneous regions, and to associate them with keywords. These labeled profiles would be used to build a multi-modal thesaurus that could serve as a foundation for hybrid navigation and search algorithms. Our approach has two main steps. First, each image is coarsely segmented into regions, and visual features are extracted from each region. Second, the regions are categorized using a novel algorithm that performs clustering and feature weighting simultaneously. As a result, we obtain clusters of regions that share subsets of relevant features. Representatives from each cluster and their relevant visual and textual features would be used to build a thesaurus. The proposed approach is validated using a collection of 1169 images.
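The second step above, categorizing regions with an algorithm that clusters and weights features simultaneously, can be sketched as a weighted k-means variant in which each cluster learns its own feature-relevance weights. This is only an illustrative stand-in: the abstract does not give the paper's actual update rules, so the farthest-point initialization, the exponent `q`, and the inverse-variance weight update below are all assumptions.

```python
import numpy as np

def weighted_kmeans(X, k, n_iter=50, q=2.0):
    """Toy sketch of simultaneous clustering and per-cluster feature
    weighting. Features with low within-cluster variance receive
    higher relevance weights, so each cluster ends up defined by the
    subset of features on which its regions agree."""
    n, d = X.shape
    # greedy farthest-point initialization (an assumption, for stability)
    centers = [X[0]]
    for _ in range(1, k):
        d2 = np.min([((X - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(X[d2.argmax()])
    centers = np.array(centers, dtype=float)
    weights = np.full((k, d), 1.0 / d)  # per-cluster feature relevance
    labels = np.zeros(n, dtype=int)
    for _ in range(n_iter):
        # assign each region to the cluster with the smallest weighted distance
        dist = np.stack([((weights[j] ** q) * (X - centers[j]) ** 2).sum(axis=1)
                         for j in range(k)])
        labels = dist.argmin(axis=0)
        for j in range(k):
            pts = X[labels == j]
            if len(pts) == 0:
                continue
            centers[j] = pts.mean(axis=0)
            # low within-cluster variance => high feature relevance
            var = ((pts - centers[j]) ** 2).sum(axis=0) + 1e-9
            inv = var ** (-1.0 / (q - 1))
            weights[j] = inv / inv.sum()
    return labels, centers, weights
```

The returned cluster centers would play the role of the representative visual profiles, and the per-cluster weights indicate which visual features are relevant to each profile when pairing it with a keyword.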
