Chi-MIC-share: a new feature selection algorithm for quantitative structure-activity relationship models

Abstract

Quantitative structure-activity relationship models are used in toxicology to predict the effects of organic compounds on aquatic organisms. Common filter feature selection methods use correlation statistics to rank features, but this approach considers only the correlation between a single feature and the response variable and does not take into account feature redundancy. Although the minimal redundancy maximal relevance approach considers the redundancy among features, direct removal of the redundant features may result in loss of prediction accuracy, and cross-validation of training sets to select an optimal subset of features is time-consuming. In this paper, we describe the development of a feature selection method, Chi-MIC-share, which can terminate feature selection automatically and is based on an improved maximal information coefficient and a redundant allocation strategy. We validated Chi-MIC-share using three environmental toxicology datasets and a support vector regression model. The results show that Chi-MIC-share is more accurate than other feature selection methods. We also performed a significance test on the model and analyzed the single-factor effects of the reserved descriptors.
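
The contrast drawn above between single-feature ranking and redundancy-aware selection can be illustrated with a minimal sketch. This is not the Chi-MIC-share algorithm: absolute Pearson correlation stands in for the maximal information coefficient, the stopping point `k` is fixed by hand rather than determined automatically, and the function names and synthetic data are hypothetical, chosen only for illustration.

```python
import numpy as np

def abs_corr(a, b):
    """Absolute Pearson correlation (an assumed, simple stand-in for MIC)."""
    return abs(np.corrcoef(a, b)[0, 1])

def rank_by_relevance(X, y):
    """Univariate filter: order features by their relevance to y alone."""
    relevance = np.array([abs_corr(X[:, j], y) for j in range(X.shape[1])])
    return [int(j) for j in np.argsort(-relevance)]

def greedy_mrmr(X, y, k):
    """Greedy mRMR-style selection: relevance minus mean redundancy."""
    n_features = X.shape[1]
    relevance = np.array([abs_corr(X[:, j], y) for j in range(n_features)])
    selected = [int(np.argmax(relevance))]
    while len(selected) < k:
        best_j, best_score = None, -np.inf
        for j in range(n_features):
            if j in selected:
                continue
            # Mean correlation with the features already chosen.
            redundancy = np.mean([abs_corr(X[:, j], X[:, s]) for s in selected])
            score = relevance[j] - redundancy
            if score > best_score:
                best_j, best_score = j, score
        selected.append(best_j)
    return selected

# Synthetic data: feature 1 is a near-duplicate of feature 0,
# feature 2 is independent of both but also informative about y.
rng = np.random.default_rng(0)
n = 500
x0 = rng.normal(size=n)
x1 = x0 + 0.01 * rng.normal(size=n)
x2 = rng.normal(size=n)
X = np.column_stack([x0, x1, x2])
y = x0 + 0.5 * x2 + 0.1 * rng.normal(size=n)

top2_filter = rank_by_relevance(X, y)[:2]  # relevance only
top2_mrmr = greedy_mrmr(X, y, k=2)         # relevance and redundancy
print("filter:", top2_filter, "mRMR-style:", top2_mrmr)
```

In this toy setting the relevance-only ranking spends both slots on the near-duplicate pair, while the redundancy-aware variant picks one of the pair plus the independent feature. The abstract's point is that dropping redundant features outright can still cost accuracy, which is the gap the redundant allocation strategy is meant to address.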

Bibliographic details

  • Source
    RSC Advances, 2020, Issue 34, 9 pages
  • Author affiliations

    Hunan Agr Univ Hunan Engn & Technol Res Ctr Agr Big Data Anal & Changsha 410128 Peoples R China;
    Hunan Agr Univ Hunan Engn & Technol Res Ctr Agr Big Data Anal & Changsha 410128 Peoples R China;
    Hunan Agr Univ Hunan Engn & Technol Res Ctr Agr Big Data Anal & Changsha 410128 Peoples R China;
    Clemson Univ Sch Comp Clemson SC USA;
    Hunan Agr Univ Hunan Engn & Technol Res Ctr Agr Big Data Anal & Changsha 410128 Peoples R China;
    Hunan Agr Univ Hunan Engn & Technol Res Ctr Agr Big Data Anal & Changsha 410128 Peoples R China;

  • Original format: PDF
  • Language: English
  • Chinese Library Classification: Chemistry
