Algorithmic randomness based feature selection for traditional Chinese chronic gastritis diagnosis

Abstract

Machine learning methods that account for multivariate interaction effects have become mainstream in feature selection. However, the feature importance scores they generate are not statistically interpretable, which hampers their application in practice, for example in medical diagnosis. In this study, a framework of Algorithmic Randomness based Feature Selection (ARFS) is proposed that measures the feature importance score using the p-value derived from combining an algorithmic randomness test with machine learning methods. In ARFS, a machine learning algorithm, such as random forest (RF), support vector machine (SVM), or naive Bayes classifier (NB), is used to compute the nonconformity score of each example with respect to the data distribution, and the p-value of the algorithmic randomness test is then obtained from the nonconformity scores. ARFS evaluates the importance of each feature by the reduction in p-value on the dataset before and after random permutation of that feature, which makes the score statistically interpretable. To demonstrate its efficiency, three ARFS models, i.e. ARFS-RF, ARFS-SVM and ARFS-NB, were compared with several feature selection approaches, i.e. RF-ACC, RF-Gini, KNNpermute, SMFS, ANOVA and SNR. The results showed that ARFS-RF achieved better performance on both the synthetic and the benchmark datasets. A further study on a chronic gastritis dataset in Traditional Chinese Medicine (TCM) showed that the symptom sets given by ARFS-RF perform substantially better than those of the same size given by TCM experts. The symptom ranking list generated by ARFS-RF can offer guidance to physicians in designing, selecting, and interpreting the symptoms used in chronic gastritis diagnosis. (C) 2014 Elsevier B.V. All rights reserved.
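The permutation idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: a nearest-centroid distance ratio stands in for the RF, SVM, or NB nonconformity measures; the p-values are computed against scores from a held-out set; and the per-example p-values are aggregated by a simple mean. All function names (`fit_centroids`, `nonconformity`, `mean_p_value`, `arfs_importance`, `make_toy_data`) are hypothetical.

```python
import random


def fit_centroids(X, y):
    """Per-class mean vectors (assumed stand-in for the RF/SVM/NB models)."""
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(row))
        for k, value in enumerate(row):
            acc[k] += value
        counts[label] = counts.get(label, 0) + 1
    return {c: [s / counts[c] for s in sums[c]] for c in sums}


def nonconformity(centroids, row, label):
    """Distance to the own-class centroid over distance to the nearest other one."""
    def dist(a, b):
        return sum((p - q) ** 2 for p, q in zip(a, b)) ** 0.5
    own = dist(row, centroids[label])
    other = min(dist(row, c) for lab, c in centroids.items() if lab != label)
    return own / (other + 1e-12)


def mean_p_value(reference_scores, scores):
    """Mean conformal p-value of `scores` relative to `reference_scores`."""
    n = len(reference_scores)
    total = 0.0
    for s in scores:
        at_least_as_strange = sum(1 for r in reference_scores if r >= s)
        total += (at_least_as_strange + 1) / (n + 1)
    return total / len(scores)


def arfs_importance(X_fit, y_fit, X_eval, y_eval, n_repeats=5, seed=0):
    """Importance of feature k = drop in mean p-value after permuting column k."""
    rng = random.Random(seed)
    centroids = fit_centroids(X_fit, y_fit)
    base = [nonconformity(centroids, r, l) for r, l in zip(X_eval, y_eval)]
    p_before = mean_p_value(base, base)
    importances = []
    for k in range(len(X_eval[0])):
        drop = 0.0
        for _ in range(n_repeats):
            column = [r[k] for r in X_eval]
            rng.shuffle(column)  # break the link between feature k and the label
            permuted = [r[:k] + [v] + r[k + 1:] for r, v in zip(X_eval, column)]
            scores = [nonconformity(centroids, r, l)
                      for r, l in zip(permuted, y_eval)]
            drop += p_before - mean_p_value(base, scores)
        importances.append(drop / n_repeats)
    return importances


def make_toy_data(n, seed):
    """Two features: column 0 depends on the label, column 1 is pure noise."""
    rng = random.Random(seed)
    X, y = [], []
    for _ in range(n):
        label = rng.randint(0, 1)
        X.append([rng.gauss(3.0 * label, 1.0), rng.gauss(0.0, 1.0)])
        y.append(label)
    return X, y


X_fit, y_fit = make_toy_data(200, seed=1)
X_eval, y_eval = make_toy_data(200, seed=2)
importances = arfs_importance(X_fit, y_fit, X_eval, y_eval)
print(importances)
```

On this toy data the informative column should receive a clearly larger score than the noise column, because permuting it makes many examples look strange relative to the unpermuted scores and so lowers their p-values. Swapping in a different classifier only requires replacing `fit_centroids` and `nonconformity`.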

Bibliographic information

  • Source
    Neurocomputing | 2014, No. 22 | pp. 252-264 | 13 pages
  • Author affiliations

    Huaqiao Univ, Coll Comp Sci & Technol, Xiamen 361021, Peoples R China;

    Huaqiao Univ, Coll Comp Sci & Technol, Xiamen 361021, Peoples R China;

    Xiamen Univ, Sch Informat Sci & Technol, Xiamen 361005, Peoples R China;

    Xiamen Univ, Sch Informat Sci & Technol, Xiamen 361005, Peoples R China;

    Xiamen Univ, Sch Informat Sci & Technol, Xiamen 361005, Peoples R China;

    China Acad Chinese Med Sci, Inst Informat Tradit Chinese Med, Beijing 100700, Peoples R China;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Text language: English
  • Chinese Library Classification
  • Keywords

    Feature selection; Feature importance; Algorithmic randomness; Conformal predictor; Chronic gastritis; Random forests;

