Conference: IEEE International Conference on Bioinformatics and Bioengineering

Feature Selection with Non-Linear Dependence Based on Multi-objective Strategy

Abstract

Identifying a small set of useful features from high-dimensional data for use in designing a classification mechanism is an interesting and important problem. Researchers usually prefer features with high relevance, in the sense that the correlation of each feature with the class labels is high, or the mutual information between each feature and the class labels is high. Such approaches often end up selecting features that are linearly dependent on each other. For some biological studies, it may be interesting to find a set of genes (features) that are highly relevant to the class labels and are also non-linearly dependent; we explicitly want to exclude relevant genes that are linearly correlated with one another. Although our primary focus in this study is to find such genes in microarray data sets, such features may also be important in other studies. In this study, the Combinations of Relevantly Non-linear Dependency Subsets (CoRNDS) method is proposed to tackle this multi-objective problem. It provides a way to simultaneously control the number of selected features, optimize the relevance between the selected features and the class labels, and optimize the non-linear dependency among the selected features. We design three new objectives and optimize them using the well-known multi-objective evolutionary algorithm based on decomposition (MOEA/D). To the best of our knowledge, this is the first attempt at feature (gene) selection combined with identification of non-linear dependency between features via a multi-objective strategy. Experimental results demonstrate the feasibility and effective performance of the approach on microarray cancer data. For the selected gene subsets, investigating their auxiliary co-regulatory role in biological pathways and their occurrence in the pathogenesis of cancer is interesting future work.
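The abstract does not give the three objectives in closed form, so the following is only a minimal sketch of how a three-objective evaluation of a candidate feature subset might look. Everything in it is an assumption made for illustration: relevance is measured by absolute Pearson correlation with the labels, non-linear dependence between two features is scored heuristically as distance correlation minus absolute Pearson correlation (near zero for a purely linear relation), and the names `abs_pearson`, `distance_correlation`, `subset_objectives`, and `tchebycheff` are hypothetical. The last function shows the Tchebycheff scalarization commonly used in MOEA/D to turn an objective vector into one subproblem value; it is not a full MOEA/D implementation and is not claimed to be the authors' formulation.

```python
import numpy as np


def abs_pearson(a, b):
    """Absolute Pearson correlation; 0.0 if either sample is constant."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    if a.std() == 0.0 or b.std() == 0.0:
        return 0.0
    return float(abs(np.corrcoef(a, b)[0, 1]))


def _centered_distances(v):
    """Double-centered matrix of pairwise absolute differences."""
    d = np.abs(v[:, None] - v[None, :])
    return d - d.mean(axis=0, keepdims=True) - d.mean(axis=1, keepdims=True) + d.mean()


def distance_correlation(a, b):
    """Sample distance correlation of two 1-D samples (0 when undefined)."""
    A = _centered_distances(np.asarray(a, dtype=float))
    B = _centered_distances(np.asarray(b, dtype=float))
    denom = np.sqrt((A * A).mean() * (B * B).mean())
    if denom == 0.0:
        return 0.0
    return float(np.sqrt(max((A * B).mean(), 0.0) / denom))


def subset_objectives(X, y, mask):
    """Return a minimization vector for the subset selected by `mask`.

    Assumed objectives (illustrative, not taken from the paper):
      f1 = number of selected features
      f2 = -(mean absolute correlation of selected features with labels)
      f3 = -(mean pairwise non-linear dependence among selected features)
    """
    idx = np.flatnonzero(mask)
    relevance = np.mean([abs_pearson(X[:, j], y) for j in idx]) if idx.size else 0.0
    pair_scores = [
        # Heuristic: total dependence minus its linear part, floored at 0.
        max(distance_correlation(X[:, i], X[:, j]) - abs_pearson(X[:, i], X[:, j]), 0.0)
        for n, i in enumerate(idx)
        for j in idx[n + 1:]
    ]
    nonlinear = np.mean(pair_scores) if pair_scores else 0.0
    return np.array([float(idx.size), -relevance, -nonlinear])


def tchebycheff(f, weights, ideal):
    """Tchebycheff scalarization: max_i weights_i * |f_i - ideal_i|."""
    f, weights, ideal = (np.asarray(v, dtype=float) for v in (f, weights, ideal))
    return float(np.max(weights * np.abs(f - ideal)))
```

In a decomposition-based search, each weight vector defines one scalar subproblem, and a candidate subset (a binary mask over genes) would be compared with its neighbors by the value of `tchebycheff(subset_objectives(X, y, mask), weights, ideal)`. Swapping the correlation-based relevance for mutual information would only change `abs_pearson` in the relevance term.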