Journal: BioMed Research International

Informative Gene Selection and Direct Classification of Tumor Based on Chi-Square Test of Pairwise Gene Interactions

Abstract

In efforts to discover disease mechanisms and improve clinical diagnosis of tumors, it is useful to mine profiles for informative genes with definite biological meanings and to build robust classifiers with high precision. In this study, we developed a new method for tumor-gene selection, the chi-square test-based integrated rank gene and direct classifier (χ²-IRG-DC). First, we obtained the weighted integrated rank of gene importance from chi-square tests of single and pairwise gene interactions. Then, we sequentially introduced the ranked genes and removed redundant genes by using leave-one-out cross-validation of the chi-square test-based direct classifier (χ²-DC) within the training set to obtain informative genes. Finally, we determined the accuracy of independent test data by utilizing the genes obtained above with χ²-DC. Furthermore, we analyzed the robustness of χ²-IRG-DC by comparing the generalization performance of different models, the efficiency of different feature-selection methods, and the accuracy of different classifiers. An independent test of ten multiclass tumor gene-expression datasets showed that χ²-IRG-DC could efficiently control overfitting and had higher generalization performance. The informative genes selected by χ²-IRG-DC could dramatically improve the independent test precision of other classifiers; meanwhile, the informative genes selected by other feature-selection methods also had good performance in χ²-DC.
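The ranking step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the discretization rule, the weighting scheme, or the χ²-DC classifier itself, so everything specific here is an assumption made for illustration: expression values are binarized at each gene's median, a gene's pairwise contribution is the average chi-square statistic over all pairs it takes part in, and the hypothetical `pair_weight` parameter controls how much that contribution counts. The sequential gene introduction and leave-one-out cross-validation steps are not covered.

```python
from collections import Counter
from itertools import combinations
from statistics import median


def chi_square(xs, ys):
    """Pearson chi-square statistic for the contingency table of two label lists."""
    n = len(xs)
    row_totals = Counter(xs)
    col_totals = Counter(ys)
    observed = Counter(zip(xs, ys))
    stat = 0.0
    for r, n_r in row_totals.items():
        for c, n_c in col_totals.items():
            expected = n_r * n_c / n
            stat += (observed[(r, c)] - expected) ** 2 / expected
    return stat


def binarize(values):
    """Assumed discretization: 1 if the value is above the gene's median, else 0."""
    m = median(values)
    return [1 if v > m else 0 for v in values]


def integrated_rank(expr, labels, pair_weight=0.5):
    """Rank genes by single-gene plus weighted pairwise chi-square scores.

    expr maps a gene name to its expression values (one per sample);
    labels holds the tumor class of each sample. The weighting is an
    illustrative assumption, not the published formula.
    """
    states = {g: binarize(v) for g, v in expr.items()}
    single = {g: chi_square(s, labels) for g, s in states.items()}

    # Score each gene pair by testing its joint state against the class label.
    pair_sum = {g: 0.0 for g in states}
    for a, b in combinations(states, 2):
        joint = list(zip(states[a], states[b]))
        score = chi_square(joint, labels)
        pair_sum[a] += score
        pair_sum[b] += score

    n_partners = max(len(states) - 1, 1)
    total = {
        g: single[g] + pair_weight * pair_sum[g] / n_partners for g in states
    }
    return sorted(total, key=total.get, reverse=True)


if __name__ == "__main__":
    # Toy data: gene "A" separates the two classes, "B" and "C" do not.
    toy_expr = {
        "A": [1, 2, 3, 10, 11, 12],
        "B": [5, 1, 6, 2, 7, 3],
        "C": [4, 4, 4, 4, 4, 4],
    }
    toy_labels = [0, 0, 0, 1, 1, 1]
    print(integrated_rank(toy_expr, toy_labels))
```

On the toy data, the gene that separates the two classes ranks first. A fuller version would feed this ranking into a forward-selection loop scored by cross-validation, as the abstract outlines.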
