Journal: Computational Statistics & Data Analysis

Gene selection and prediction for cancer classification using support vector machines with a reject option

Abstract

In cancer classification based on gene expression data, it is desirable to defer the decision for observations that are difficult to classify. For instance, an observation whose conditional probability of cancer is around 1/2 would be better served by more advanced tests than by an immediate decision. This motivates the use of a classifier with a reject option, which reports a warning for observations that are difficult to classify. In this paper, we consider the problem of gene selection with a reject option. Typically, gene expression data comprise expression levels of several thousand candidate genes. In such cases, an effective gene selection procedure is necessary to provide a better understanding of the underlying biological system that generates the data and to improve prediction performance. We propose a machine learning approach in which we apply the l1 penalty to the SVM with a reject option. This method is referred to as the l1 SVM with a reject option. We develop a novel optimization algorithm for this SVM, which is sufficiently fast and stable to analyze gene expression data. The proposed algorithm computes the entire solution path with respect to the regularization parameter. Results of numerical studies show that, in comparison with the standard l1 SVM, the proposed method efficiently reduces prediction errors without hampering gene selectivity.
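
The deferral idea in the abstract can be illustrated with a minimal sketch. It is not the paper's l1 SVM or its solution-path algorithm. It assumes that an estimated conditional probability of cancer is already available for each observation and applies a simple threshold rule: decide only when the probability is far enough from 1/2, and otherwise defer. The names `classify_with_reject`, `summarize`, and `reject_cost` are hypothetical and chosen only for this illustration.

```python
from typing import Dict, List, Optional


def classify_with_reject(p_cancer: float, reject_cost: float = 0.2) -> Optional[int]:
    """Return 1 (cancer), 0 (non-cancer), or None (reject, i.e. defer the decision).

    Assumption: p_cancer is an estimated conditional probability of cancer.
    reject_cost is a hypothetical tuning value in (0, 0.5); a larger value
    narrows the band around 1/2 in which the decision is deferred.
    """
    if not 0.0 < reject_cost < 0.5:
        raise ValueError("reject_cost must lie strictly between 0 and 0.5")
    if p_cancer >= 1.0 - reject_cost:
        return 1
    if p_cancer <= reject_cost:
        return 0
    return None  # probability too close to 1/2: report a warning instead


def summarize(probs: List[float], labels: List[int], reject_cost: float = 0.2) -> Dict[str, int]:
    """Count rejected observations and errors among the accepted ones."""
    decisions = [classify_with_reject(p, reject_cost) for p in probs]
    accepted = [(d, y) for d, y in zip(decisions, labels) if d is not None]
    return {
        "accepted": len(accepted),
        "rejected": len(decisions) - len(accepted),
        "errors_among_accepted": sum(1 for d, y in accepted if d != y),
    }


if __name__ == "__main__":
    # Made-up probabilities and labels, used only to exercise the rule.
    probs = [0.95, 0.52, 0.10, 0.70, 0.85]
    labels = [1, 0, 0, 1, 0]
    print(summarize(probs, labels, reject_cost=0.2))
```

In this toy rule, raising `reject_cost` toward 0.5 rejects fewer observations, while lowering it defers more of the borderline cases to further testing.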
