Conference: SIAM International Conference on Data Mining

A Salient Ensemble of Trees using Cascaded Linear Classifiers with Feature-Cost Constraints

Abstract

In many applications, a classification model must make good use of limited resources when predicting an instance, e.g., the limited response time of a real-time search engine. To satisfy the resource constraint, many researchers simplify the model structure or shrink the feature subset. Because informative features may be too costly for the model, a common approach is to build the model by considering the trade-off between performance and cost. However, most previous work assumes that the cost of a feature is independent of the cost of any other feature, which is not realistic. In this paper, we consider two categories of feature cost: individual cost and group cost. The former is independent of the cost of any other feature, whereas the latter captures the cost dependency among the features in the corresponding group. We propose a two-stage framework that integrates cost-sensitive feature selection with learning a model under a cost budget constraint. First, we propose the group-cost-sensitive random forest (GOAT) model, which considers these two costs to select a proper feature subset. Second, we propose a salient ensemble of trees, each of which uses cascaded linear classifiers (ETIC), that satisfies the feature-cost constraints using the features derived from the GOAT model. We conduct experiments on real-world datasets, including mobile-user preference data and object detection data. When the group cost dominates, GOAT-ETIC gains a 10-30% improvement over the baselines. Even when the group cost is ignored, GOAT-ETIC still outperforms the state-of-the-art methods.
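
The distinction between individual cost and group cost can be illustrated with a small sketch. The abstract does not give the exact cost model, so the code below assumes one plausible reading: each selected feature pays its own individual cost, and each group pays a shared cost once, as soon as any of its features is selected. The function name `subset_cost`, the feature names, and all cost values are hypothetical and are not taken from the paper.

```python
from typing import Dict, Iterable


def subset_cost(selected: Iterable[str],
                individual_cost: Dict[str, float],
                group_of: Dict[str, str],
                group_cost: Dict[str, float]) -> float:
    """Total cost of a feature subset under an ASSUMED cost model:
    individual costs add up per feature, and a group's shared cost
    is charged once if at least one feature of that group is selected."""
    chosen = set(selected)
    # Individual cost: independent of every other feature.
    total = sum(individual_cost[f] for f in chosen)
    # Group cost: depends on which groups the chosen features touch.
    groups_used = {group_of[f] for f in chosen if f in group_of}
    total += sum(group_cost[g] for g in groups_used)
    return total


# Hypothetical mobile-user features: two share a "gps" group whose
# shared cost (e.g. switching the sensor on) is paid only once.
individual_cost = {"gps_lat": 1.0, "gps_lon": 1.0, "app_count": 0.5}
group_of = {"gps_lat": "gps", "gps_lon": "gps"}
group_cost = {"gps": 5.0}

one_gps = subset_cost(["gps_lat"], individual_cost, group_of, group_cost)
both_gps = subset_cost(["gps_lat", "gps_lon"], individual_cost, group_of, group_cost)
no_group = subset_cost(["app_count"], individual_cost, group_of, group_cost)
print(one_gps, both_gps, no_group)
```

Under this assumed model, adding a second feature from an already-used group raises the total only by that feature's individual cost, which is why treating feature costs as independent would overestimate the cost of the subset.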
