Performance evaluation of large-scale object recognition system using bag-of-visual words model

Min-Uk Kim; Kyoungro Yoon

Abstract

Object recognition technology is usually used for recognizing specific objects, such as book covers, landmarks, and vehicles. In most situations it relies on multi-dimensional local image descriptors, which are designed to be robust to environmental changes such as changes in illumination, view angle, and scale. When the database contains many target objects, recognition against a large-scale local image descriptor database is not a trivial task, because of the high dimensionality of the descriptors. To obtain consistent responses from a large-scale database within a reasonable time, we need a data structure that supports indexing and querying. A vocabulary tree is a data structure built from local image descriptors and is commonly used to cope with massive descriptor databases. Using a vocabulary tree, a local image descriptor can be mapped to the ID of a leaf node of the tree, which serves as a visual word for object recognition. Visual words can then be exploited effectively by a traditional text retrieval engine. In this study, we built a large-scale object recognition system using a vocabulary tree with 1 million leaf nodes of Scale-Invariant Feature Transform (SIFT) descriptors, the most promising local image descriptor in terms of precision. We implemented the proposed system using publicly available software so that it can be reproduced and enhanced easily. We then evaluated the proposed system and compared its performance with that of the current MPEG CDVS (Compact Descriptors for Visual Search) standard, using a database containing two-dimensional planar object datasets of three categories together with one million distractor images. In addition to these datasets, which are equivalent to those of CDVS, we added a new dataset designed to mimic realistic occlusion and clutter effects. Experimental results show that the performance of the proposed system is comparable to that of CDVS, achieving 90% precision at a retrieval time of 5 s. We also identify characteristics of the vocabulary tree that limit its adaptation to a specific application domain.
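The pipeline described in the abstract (descriptor, then leaf node ID, then visual word, then text-style retrieval) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration, not the authors' implementation: the descriptors are synthetic 4-dimensional vectors rather than 128-dimensional SIFT descriptors, the tree uses a branching factor of 3 and a depth of 2 (a tree with about one million leaves could, for instance, use a branching factor of 10 and a depth of 6, but the abstract does not state the configuration), and the TF-IDF cosine scoring stands in for whatever text retrieval engine the system actually uses. All function names are hypothetical.

```python
import math
import random
from collections import Counter, defaultdict


def sq_dist(a, b):
    """Squared Euclidean distance between two descriptors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def nearest(centers, desc):
    """Index of the center closest to desc."""
    return min(range(len(centers)), key=lambda j: sq_dist(desc, centers[j]))


def split(points, centers):
    """Partition points by their nearest center."""
    groups = [[] for _ in centers]
    for p in points:
        groups[nearest(centers, p)].append(p)
    return groups


def kmeans(points, k, rng, iters=10):
    """Plain Lloyd's k-means; returns the final centers and their groups."""
    centers = rng.sample(points, k)
    for _ in range(iters):
        groups = split(points, centers)
        centers = [[sum(col) / len(g) for col in zip(*g)] if g else c
                   for g, c in zip(groups, centers)]
    return centers, split(points, centers)


def build_tree(points, branch, depth, rng):
    """Hierarchical k-means: each inner node stores `branch` centers."""
    if depth == 0 or len(points) < branch:
        return {"children": None}
    centers, groups = kmeans(points, branch, rng)
    return {"centers": centers,
            "children": [build_tree(g, branch, depth - 1, rng) for g in groups]}


def number_leaves(node, next_id=0):
    """Assign consecutive leaf IDs (the visual words); returns the leaf count."""
    if node["children"] is None:
        node["leaf_id"] = next_id
        return next_id + 1
    for child in node["children"]:
        next_id = number_leaves(child, next_id)
    return next_id


def quantize(node, desc):
    """Descend the tree greedily and return the leaf ID for one descriptor."""
    while node["children"] is not None:
        node = node["children"][nearest(node["centers"], desc)]
    return node["leaf_id"]


def bow(tree, descs):
    """Bag of visual words: leaf ID -> occurrence count."""
    return Counter(quantize(tree, d) for d in descs)


def build_index(tree, images):
    """Inverted index of L2-normalised TF-IDF weights, as in text retrieval."""
    hists = {img: bow(tree, descs) for img, descs in images.items()}
    df = Counter(w for h in hists.values() for w in h)
    idf = {w: math.log(1 + len(images) / n) for w, n in df.items()}
    inverted = defaultdict(dict)
    for img, h in hists.items():
        vec = {w: c * idf[w] for w, c in h.items()}
        norm = math.sqrt(sum(v * v for v in vec.values()))
        for w, v in vec.items():
            inverted[w][img] = v / norm
    return inverted, idf


def search(tree, inverted, idf, descs):
    """Rank database images by cosine similarity to the query's visual words."""
    vec = {w: c * idf[w] for w, c in bow(tree, descs).items() if w in idf}
    norm = math.sqrt(sum(v * v for v in vec.values())) or 1.0
    scores = Counter()
    for w, v in vec.items():
        for img, dv in inverted[w].items():
            scores[img] += (v / norm) * dv
    return scores.most_common()


# Toy data: three "images", each a cloud of synthetic 4-D descriptors.
rng = random.Random(0)
images = {name: [[c + rng.gauss(0, 1) for _ in range(4)] for _ in range(10)]
          for name, c in (("a", 0.0), ("b", 100.0), ("c", 200.0))}
training = [d for descs in images.values() for d in descs]

tree = build_tree(training, branch=3, depth=2, rng=rng)
n_leaves = number_leaves(tree)
inverted, idf = build_index(tree, images)
ranking = search(tree, inverted, idf, images["b"])
```

Because the query here reuses the descriptors of image "b", that image receives a cosine score of 1 (up to rounding), the maximum possible. The design point the sketch is meant to show is that quantizing a descriptor costs only `branch × depth` distance computations rather than one per leaf, which is what makes a vocabulary with a very large number of leaves practical to query.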

Bibliographic information

Source: Multimedia Tools and Applications, 2015, Issue 7, pp. 2499-2517 (19 pages)
Authors: Min-Uk Kim; Kyoungro Yoon
Affiliation: School of Computer Science and Engineering, Konkuk University, Seoul, South Korea 143-701 (both authors)
Original format: PDF
Language: English
Keywords: Object recognition; Bag-of-visual words; SIFT; Vocabulary tree; CDVS; Standard
