Journal: Multimedia Tools and Applications

Saliency-based selection of visual content for deep convolutional neural networks



Abstract

The automatic description of digital multimedia content has been developed mainly for classification tasks, retrieval systems, and the large-scale ordering of data. The preservation of cultural heritage is an important application field for these methods. We address a classification problem in cultural heritage: the classification of architectural styles in digital photographs of Mexican cultural heritage. In general, selecting the relevant content of a scene when training classification models makes the models more efficient in terms of accuracy and training time. Here we use a saliency-driven approach to predict visual attention in images and use it to train a deep convolutional neural network. We also analyze the behavior of models trained with state-of-the-art image cropping and with saliency maps. Training rotation-invariant models requires data augmentation of the training set, which poses the problem of filling when crops are normalized; we study different padding techniques and find an optimal solution. The results are compared with the state of the art in terms of accuracy and training time. Furthermore, we study saliency cropping in training and generalization for another classical task: weak labeling of massive collections of images containing objects of interest. These experiments are conducted on a large subset of the ImageNet database. This work extends preliminary research with respect to image padding methods and generalization on a large-scale generic database.
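As a rough illustration of how saliency-driven cropping and padding can interact, the following minimal sketch crops a square window around the most salient pixel and fills any area that falls outside the image. It is not the authors' method: the function name `saliency_crop`, the choice of the saliency maximum as the crop centre, and the padding modes offered by `numpy.pad` are all assumptions made for the example, and the saliency map is taken as a given input rather than predicted.

```python
import numpy as np

def saliency_crop(image, saliency, crop_size, pad_mode="reflect"):
    """Crop a crop_size x crop_size window centred on the saliency peak.

    image:    (H, W, C) array.
    saliency: (H, W) array of saliency scores (assumed precomputed).
    pad_mode: any mode accepted by numpy.pad, e.g. "constant", "edge", "reflect".
    """
    # Hypothetical choice: centre the crop on the single most salient pixel.
    cy, cx = np.unravel_index(np.argmax(saliency), saliency.shape)
    half = crop_size // 2
    # Pad the spatial axes so the window never leaves the array.
    padded = np.pad(image, ((half, half), (half, half), (0, 0)), mode=pad_mode)
    # Padding shifts coordinates by `half`, so the window starts at (cy, cx).
    return padded[cy:cy + crop_size, cx:cx + crop_size]
```

Calling the function with different `pad_mode` values on the same image and saliency map gives a quick way to compare how each padding choice fills crops that extend past the image border.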
