Journal: IEEE Transactions on Knowledge and Data Engineering

Towards Automatic Construction of Diverse, High-Quality Image Datasets

Abstract

The availability of labeled image datasets has been shown to be critical for high-level image understanding, and it continuously drives progress in feature design and model development. However, constructing labeled image datasets is laborious and monotonous. To eliminate manual annotation, we propose a novel image dataset construction framework that employs multiple textual queries. We aim to collect diverse and accurate images from the Web for given queries. Specifically, we formulate the removal of noisy textual queries and the filtering of noisy images as a multi-view learning problem and a multi-instance learning problem, respectively. The proposed approach improves both the accuracy and the diversity of the selected images. To verify its effectiveness, we construct an image dataset with 100 categories. Experiments show significant performance gains from using the data generated by our approach on several tasks, including image classification, cross-dataset generalization, and object detection. The proposed method also consistently outperforms existing weakly supervised and web-supervised approaches.