Journal: Engineering Applications of Artificial Intelligence

Online training of concept detectors for image retrieval using streaming clickthrough data



Abstract

Clickthrough data from image search engines provide a massive, continuously generated source of user feedback that can be used to model how search engine users perceive visual content. Image clickthrough data have been used successfully to build concept detectors without any manual annotation effort, although the generated annotations suffer from labeling errors. Previous research has therefore focused on modeling sample uncertainty in order to improve concept detector effectiveness. In this paper, we study the problem in an online learning setting using streaming clickthrough data, where each click is treated separately as soon as it becomes available; the concept detector model is therefore updated continuously, without batch retraining. We argue that sample uncertainty can be incorporated in the online learning setting by exploiting the repetitions of incoming clicks at the classifier level, where these repetitions act as an implicit importance weighting mechanism. For online concept detector training we use the LASVM algorithm. The inferred weighting approximates the solution of batch-trained concept detectors that use weighted SVM variants, which are known to achieve better performance and higher robustness to noise than the standard SVM. Furthermore, we evaluate methods for selecting negative samples using a small number of candidates sampled locally from the incoming stream of clicks. The selection criteria aim at drastically improving the performance and the convergence speed of the online concept detectors.
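The negative-selection step mentioned above can be illustrated with a small sketch. The abstract does not state which selection criteria the authors evaluate, so the rule used here (pick the candidate with the highest current decision score, that is, the "hardest" negative) is an assumption for illustration only, as are the linear model, the function name `select_negative`, and the toy feature vectors.

```python
import numpy as np

def select_negative(w, b, candidates):
    """Pick one negative sample from a small candidate pool.

    Assumed criterion (not taken from the paper): choose the candidate
    with the largest decision value w.x + b. For a sample labeled -1,
    a larger score means a larger hinge loss under the current model.
    """
    scores = candidates @ w + b
    idx = int(np.argmax(scores))
    return idx, candidates[idx]

# Hypothetical current linear detector and a pool of four candidates
# sampled from recent clicks on other queries.
w = np.array([1.0, -0.5])
b = 0.0
candidates = np.array([
    [-2.0, 0.0],
    [0.5, 0.1],
    [3.0, 1.0],
    [-1.0, -1.0],
])

idx, x_neg = select_negative(w, b, candidates)
print(idx, x_neg)
```

A random-sampling baseline would simply draw one index uniformly from the pool; the comparison reported in the abstract is between such random selection and informed criteria.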
To validate our arguments, we conduct experiments for 30 concepts on the Clickture-Lite dataset. The experimental results demonstrate that: (a) the proposed online approach produces effective and noise-resilient concept detectors that can take advantage of streaming clickthrough data, achieving performance that is equivalent to Fuzzy SVM concept detectors with sample weights and 78.6% better than standard SVM concept detectors; and (b) the selection criteria speed up convergence and improve effectiveness compared to random negative sampling, even for a small number of available clicks (up to 134% after 100 clicks).
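One way to see why repeated clicks can act as implicit importance weights is at the level of the loss: summing the hinge loss over a stream in which a sample appears c times gives the same value as a weighted hinge loss in which that sample has weight c. The sketch below checks this identity on toy data. It is not LASVM and not the authors' implementation; the image identifiers, feature vectors, and model vector `w` are hypothetical.

```python
import numpy as np
from collections import Counter

def hinge(w, x, y):
    """Hinge loss of a linear model w on sample x with label y in {-1, +1}."""
    return max(0.0, 1.0 - y * float(np.dot(w, x)))

# Hypothetical image features, keyed by image id.
features = {
    "img_a": np.array([1.0, 0.0]),
    "img_b": np.array([0.2, 0.9]),
    "img_c": np.array([-1.0, 0.5]),
}

# Hypothetical click stream: img_a is clicked three times for the concept.
stream = [("img_a", 1), ("img_b", 1), ("img_a", 1), ("img_c", -1), ("img_a", 1)]

w = np.array([0.3, -0.2])  # arbitrary current model

# Loss accumulated click by click, as an online learner would see it.
stream_loss = sum(hinge(w, features[i], y) for i, y in stream)

# Same loss written as a weighted sum over unique samples,
# with weight equal to the number of repetitions.
counts = Counter(stream)
weighted_loss = sum(c * hinge(w, features[i], y) for (i, y), c in counts.items())

assert np.isclose(stream_loss, weighted_loss)
```

The identity holds exactly for the loss term only. How closely an online solver's solution tracks the weighted batch solution depends on the algorithm, which is why the abstract describes the relationship as an approximation.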
