Conference: Pattern Recognition in Bioinformatics

New Gene Subset Selection Approaches Based on Linear Separating Genes and Gene-Pairs

Abstract

The linear separability of gene expression data sets with respect to two classes has recently been studied in the literature. The problem is to efficiently find all pairs of genes that induce a linear separation of the data. It has been suggested that an underlying molecular mechanism relates the two genes of a separating pair to the phenotype under study, such as a specific cancer. In this paper we study the Containment Angle (CA), defined on the unit circle for a linearly separating gene-pair (LS-pair), as an alternative to the paired t-test ranking function for gene selection. Using the CA, we also show empirically that a given classifier's error is related to the degree of linear separability of a given data set. Finally, we propose gene subset selection methods that are based on the CA ranking function for LS-pairs and on a ranking function for linearly separating genes (LS-genes), and that select only among LS-genes and LS-pairs. On many data sets, our methods give better results in terms of subset size and classification accuracy than a well-performing existing method.
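
The notion of an LS-gene can be illustrated with a minimal sketch. It assumes that a single gene is linearly separating when one threshold on its expression value splits the two classes, i.e. the class ranges do not overlap. The function names `is_ls_gene` and `ls_genes` and the toy matrix are hypothetical and not taken from the paper; the CA ranking function itself is not reproduced here because the abstract does not define it.

```python
from typing import List, Sequence


def is_ls_gene(values: Sequence[float], labels: Sequence[int]) -> bool:
    """Return True if one threshold on this gene separates class 0 from class 1."""
    a = [v for v, y in zip(values, labels) if y == 0]
    b = [v for v, y in zip(values, labels) if y == 1]
    if not a or not b:
        raise ValueError("both classes must be present")
    # Strict separation: the two class ranges must neither overlap nor touch.
    return max(a) < min(b) or max(b) < min(a)


def ls_genes(expr: Sequence[Sequence[float]], labels: Sequence[int]) -> List[int]:
    """Return indices of rows (genes) of expr that are linearly separating."""
    return [g for g, row in enumerate(expr) if is_ls_gene(row, labels)]


# Toy data, made up for illustration: 3 genes (rows) x 4 samples (columns).
expr = [
    [1.0, 2.0, 5.0, 6.0],  # classes fall in disjoint ranges
    [1.0, 5.0, 2.0, 6.0],  # class ranges overlap
    [9.0, 8.0, 3.0, 2.0],  # disjoint ranges, class 1 lower
]
labels = [0, 0, 1, 1]
print(ls_genes(expr, labels))  # → [0, 2]
```

The same idea extends to gene pairs by asking whether a line in the plane of two genes separates the classes, which requires a two-dimensional separability test rather than a simple range comparison.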