
K-split Lasso: An Effective Method for Selecting Tumor Feature Genes

Abstract

With the advent of DNA microarray technology, a large number of gene expression datasets for different tumors have been published online and can be freely downloaded. As a result, informative gene selection and tumor subtype classification have become active research topics in bioinformatics. This paper proposes K-split Lasso, a feature selection method based on Lasso (least absolute shrinkage and selection operator). The basic idea is to divide the feature set evenly into K parts, apply Lasso to each part to select genes, merge the selected feature subsets, and then perform feature selection again on the merged set to obtain the final informative genes. Experiments using a support vector machine as the classifier indicate that K-split Lasso reduces redundant features, improves classification accuracy, and has good stability. Because each run operates on fewer dimensions, K-split Lasso avoids the excessive computational cost of selecting from all features at once and, to some extent, alleviates overfitting. K-split Lasso is therefore an effective method for selecting tumor feature genes.
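The split, select, merge, and reselect procedure described in the abstract can be sketched roughly as follows. This is a minimal illustration and not the authors' implementation: the abstract does not state which Lasso solver is used, how class labels are encoded, or how the penalty is chosen, so the cyclic coordinate-descent solver, the function names `lasso_cd` and `k_split_lasso`, the penalty parameter `alpha`, and the use of Lasso for the final reselection step are all assumptions made for this example. The support vector machine classification step is not shown.

```python
import numpy as np


def lasso_cd(X, y, alpha, n_iter=200):
    """Lasso by cyclic coordinate descent (an assumed solver, not from the paper).

    Minimizes (1 / (2 n)) * ||y - X w||^2 + alpha * ||w||_1 and returns w.
    """
    n, p = X.shape
    w = np.zeros(p)
    col_sq = (X ** 2).sum(axis=0) / n      # x_j^T x_j / n for every column
    resid = y.astype(float).copy()         # residual y - X w, starting at w = 0
    for _ in range(n_iter):
        for j in range(p):
            if col_sq[j] == 0.0:
                continue                   # constant column carries no signal
            # Correlation of column j with the residual that excludes column j.
            rho = X[:, j] @ resid / n + col_sq[j] * w[j]
            # Soft-thresholding update for coordinate j.
            w_new = np.sign(rho) * max(abs(rho) - alpha, 0.0) / col_sq[j]
            if w_new != w[j]:
                resid -= X[:, j] * (w_new - w[j])
                w[j] = w_new
    return w


def k_split_lasso(X, y, k, alpha):
    """Return the column indices of the features kept by the two-stage procedure.

    X: (n_samples, n_genes) expression matrix; y: numeric labels (assumed encoding).
    """
    Xc = X - X.mean(axis=0)                # center so no intercept is needed
    yc = y - y.mean()

    # Stage 1: split the feature indices evenly into k parts, run Lasso on each.
    blocks = np.array_split(np.arange(X.shape[1]), k)
    merged = []
    for block in blocks:
        w = lasso_cd(Xc[:, block], yc, alpha)
        merged.extend(block[w != 0])       # keep genes with non-zero weight
    merged = np.array(merged, dtype=int)
    if merged.size == 0:
        return merged

    # Stage 2: reselect on the merged subset (Lasso again is an assumption here).
    w = lasso_cd(Xc[:, merged], yc, alpha)
    return merged[w != 0]
```

Each stage-1 fit sees only about 1/K of the genes, which is where the reduced per-run dimensionality mentioned in the abstract comes from. In practice `alpha` and `k` would need tuning, for example by cross-validation.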
