International Conference on Systems and Informatics

An improved XGBoost based on weighted column subsampling for object classification

Abstract

Object classification is one of the main problems in computer vision, and many state-of-the-art machine learning methods have been proposed recently. Traditional classifiers may be less effective on image representations. In this paper, an improved eXtreme Gradient Boosting (XGBoost) based on weighted column (feature) subsampling is proposed to classify image representations for object classification. The contributions of the paper are as follows. First, a Convolutional Neural Network (CNN) pre-trained on the large-scale image database ILSVRC and fine-tuned on the PASCAL VOC 2012 dataset is used to extract features from object images. In addition, we concatenate multiple layers of learned features to obtain more information about image contents for determining their categories. Second, because the extracted features are high-dimensional, feature redundancy is unavoidable, so a weighted column subsampling method is proposed and applied to the XGBoost algorithm in order to draw features according to their importance while growing each tree. Furthermore, the proposed method is general and can easily be extended to other methods that need column subsampling when the features of the data are high-dimensional and redundant. Finally, we validate the performance of the proposed method on the PASCAL VOC 2007 dataset. Compared with four typical methods, the proposed method has superior average precision (AP) on 5 out of 20 classes; e.g., the AP we obtained for the person class is 92.1, 1% higher than that of the best of those four methods.
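
The abstract does not spell out how features are drawn, so the following is only a minimal sketch of the general idea of weighted column subsampling, not the authors' procedure. It assumes each feature already has a non-negative importance score (the `weights` argument, a hypothetical input) and draws `k` distinct column indices for one tree, with the chance of picking a column proportional to its score. The function name `weighted_column_subsample` is illustrative and is not part of XGBoost.

```python
import random
from typing import List, Sequence


def weighted_column_subsample(
    weights: Sequence[float], k: int, rng: random.Random
) -> List[int]:
    """Draw k distinct column indices, favouring columns with larger weights.

    Assumption: `weights` holds one non-negative importance score per feature.
    Columns are drawn one at a time without replacement; at each step the
    probability of a column is its weight divided by the remaining total.
    """
    if not 0 < k <= len(weights):
        raise ValueError("k must be between 1 and the number of features")
    if any(w < 0 for w in weights):
        raise ValueError("weights must be non-negative")

    remaining_cols = list(range(len(weights)))
    remaining_w = [float(w) for w in weights]
    chosen: List[int] = []

    for _ in range(k):
        if sum(remaining_w) <= 0:
            raise ValueError("fewer than k features have positive weight")
        # Pick a position among the columns that are still available.
        pos = rng.choices(range(len(remaining_cols)), weights=remaining_w, k=1)[0]
        chosen.append(remaining_cols.pop(pos))
        remaining_w.pop(pos)

    return sorted(chosen)


if __name__ == "__main__":
    rng = random.Random(0)
    # Hypothetical importance scores for six features.
    importance = [0.40, 0.05, 0.25, 0.05, 0.20, 0.05]
    # One column subset per tree; a booster would train each tree on its subset.
    for tree_index in range(3):
        print(tree_index, weighted_column_subsample(importance, 3, rng))
```

With uniform weights this reduces to ordinary column subsampling, which is why the same routine could in principle be dropped into any method that subsamples columns.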
