Conference: Machine Learning and Applications, 2009. ICMLA '09

Differentially Expressed Gene Identification Based on Separability Index



Abstract

The identification of differentially expressed genes is central to microarray data analysis. Presented in this paper is an approach to differentially expressed gene identification based on a Separability Index (SI). Features are selected by identifying the optimal number of top ranking genes which result in maximum class separability. The approach was implemented on a training dataset comprising 400 samples from three types of cancers: colon, breast and lung cancer. The top 4222 genes resulted in a maximum separability of 91%. These genes were then used to classify a testing dataset comprising 250 samples, using a K-nearest neighbour (K-NN) classifier, achieving an accuracy of 92%. This outperformed a K-NN classifier trained on features selected based on p < 1.8311 × 10^-7 (Bonferroni corrected p-value cut-off criterion of p < 0.01), which achieved an accuracy of 89.6%. The performance is attributed to the non-arbitrary nature of the maximum SI selection criterion, which is an inherent property of the data, as opposed to the arbitrary assignment of a p-value cut-off. Hierarchical clustering was used to identify clusters of genes, amongst the 4222 genes, with similar expression patterns for each of the three cancers. These clusters were then examined for functional enrichment and significant biological pathways, which were identified for all three cancer types.
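The selection step described in the abstract (rank genes, then keep the number of top-ranked genes that maximises class separability) can be illustrated with a minimal sketch. The abstract does not define the SI or the ranking statistic, so the code below makes two assumptions: the SI is taken to be the nearest-neighbour separability index (the fraction of samples whose nearest neighbour has the same class label), and genes are ranked by a one-way ANOVA F-statistic. The function names, the candidate sizes and the synthetic data are all hypothetical and are not taken from the paper.

```python
import numpy as np


def separability_index(X, y):
    """Assumed SI: fraction of samples whose nearest neighbour
    (Euclidean distance) carries the same class label."""
    sq = np.sum(X ** 2, axis=1)
    # Pairwise squared distances; the diagonal is masked so a sample
    # cannot be its own nearest neighbour.
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    np.fill_diagonal(d2, np.inf)
    nearest = np.argmin(d2, axis=1)
    return float(np.mean(y[nearest] == y))


def f_scores(X, y):
    """Assumed ranking statistic: one-way ANOVA F value per gene (column)."""
    classes = np.unique(y)
    grand_mean = X.mean(axis=0)
    between = np.zeros(X.shape[1])
    within = np.zeros(X.shape[1])
    for c in classes:
        Xc = X[y == c]
        between += len(Xc) * (Xc.mean(axis=0) - grand_mean) ** 2
        within += ((Xc - Xc.mean(axis=0)) ** 2).sum(axis=0)
    df_between = len(classes) - 1
    df_within = len(y) - len(classes)
    # Small constant guards against division by zero for constant genes.
    return (between / df_between) / (within / df_within + 1e-12)


def select_top_genes(X, y, candidate_sizes):
    """Return the indices of the top-k genes, and the SI they reach,
    for the k in candidate_sizes that gives the highest SI."""
    order = np.argsort(f_scores(X, y))[::-1]  # best-ranked gene first
    best_k, best_si = None, -1.0
    for k in candidate_sizes:
        si = separability_index(X[:, order[:k]], y)
        if si > best_si:
            best_k, best_si = k, si
    return order[:best_k], best_si


# Synthetic stand-in for expression data: 60 samples, 50 genes, 3 classes.
# Only the first 5 genes differ between classes.
rng = np.random.default_rng(0)
y = np.repeat([0, 1, 2], 20)
X = rng.normal(size=(60, 50))
X[:, :5] += 3.0 * y[:, None]

sizes = [1, 2, 5, 10, 25, 50]
genes, si = select_top_genes(X, y, sizes)
print(len(genes), si)
```

In this sketch the cut-off on the number of genes comes from the data (the size at which SI peaks) rather than from a preset p-value threshold, which is the contrast the abstract draws. The selected columns could then be passed to any K-NN classifier for evaluation on held-out samples.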
