Conference: Pacific-Asia Conference on Knowledge Discovery and Data Mining

Discretization and Feature Selection Based on Bias Corrected Mutual Information Considering High-Order Dependencies

Abstract

Mutual Information (MI) based feature selection methods are popular because they can capture nonlinear relationships among variables. However, existing works rarely address the error (bias) that arises from estimating MI on finite samples. To the best of our knowledge, none of the existing methods addresses the bias of the high-order interaction term, which is essential for a better approximation of joint MI. In this paper, we first calculate the amount of bias of this term. Moreover, to select features using a χ²-based search, we also show that this term follows a χ² distribution. Based on these two theoretical results, we propose Discretization and feature Selection based on bias corrected Mutual information (DSbM). DSbM is extended by adding simultaneous forward selection and backward elimination (DSbMfb). We demonstrate the superiority of DSbM over four state-of-the-art methods in terms of accuracy and the number of selected features on twenty benchmark datasets. Experimental results also demonstrate that DSbM outperforms the existing methods in terms of accuracy, Pareto optimality and the Friedman test. We also observe that, compared to DSbM, DSbMfb selects fewer features and increases accuracy on some datasets.
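
To make the finite-sample bias mentioned in the abstract concrete, the following is a minimal sketch of the textbook pairwise case only; it is not the DSbM procedure and does not cover the paper's high-order interaction term. It assumes the well-known result that, for two independent discrete variables with |X| and |Y| levels, the plug-in MI estimate (in nats) has first-order bias (|X|-1)(|Y|-1)/(2N), and that 2N times the estimate is approximately χ² distributed with (|X|-1)(|Y|-1) degrees of freedom. All function names are illustrative.

```python
import math
import random
from collections import Counter


def plugin_mi(xs, ys):
    """Plug-in (maximum-likelihood) MI estimate in nats for two discrete sequences."""
    n = len(xs)
    cx, cy, cxy = Counter(xs), Counter(ys), Counter(zip(xs, ys))
    # Sum over observed cells only, so every count c is positive.
    return sum((c / n) * math.log(c * n / (cx[x] * cy[y]))
               for (x, y), c in cxy.items())


def first_order_bias(kx, ky, n):
    """Textbook first-order bias of plug-in MI under independence (assumption)."""
    return (kx - 1) * (ky - 1) / (2 * n)


def corrected_mi(xs, ys):
    """Plug-in MI minus the first-order bias, using the observed level counts."""
    kx, ky = len(set(xs)), len(set(ys))
    return plugin_mi(xs, ys) - first_order_bias(kx, ky, len(xs))


def simulate(trials=500, n=200, kx=4, ky=5, seed=0):
    """Average plug-in and corrected MI over independent uniform samples.

    The true MI is zero here, so the average plug-in value is pure bias.
    """
    rng = random.Random(seed)
    plug, corr = 0.0, 0.0
    for _ in range(trials):
        xs = [rng.randrange(kx) for _ in range(n)]
        ys = [rng.randrange(ky) for _ in range(n)]
        plug += plugin_mi(xs, ys)
        corr += corrected_mi(xs, ys)
    return plug / trials, corr / trials


if __name__ == "__main__":
    mean_plugin, mean_corrected = simulate()
    print("mean plug-in MI:  ", mean_plugin)
    print("mean corrected MI:", mean_corrected)
    print("predicted bias:   ", first_order_bias(4, 5, 200))
```

With independent inputs the true MI is zero, so the average plug-in estimate should sit close to the predicted bias while the corrected average should sit close to zero. A feature selector that ranks by uncorrected MI would therefore tend to favour variables with many levels, which is the kind of error a bias correction is meant to remove.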