Published in: Expert Systems with Applications

Penalized logistic regression with the adaptive LASSO for gene selection in high-dimensional cancer classification

Abstract

An important application of DNA microarray data is cancer classification. Because microarray data are high-dimensional, gene selection approaches are often employed to support expert systems in diagnosing cancer with high classification accuracy. Penalized logistic regression using the least absolute shrinkage and selection operator (LASSO) is one of the key steps in high-dimensional cancer classification, as it performs gene coefficient estimation and gene selection simultaneously. However, the LASSO has been criticized for being biased in gene selection. The adaptive LASSO (APLR) was originally proposed to overcome this selection bias by assigning a consistent weight to each gene. In high-dimensional data, however, the adaptive LASSO faces practical problems in choosing the type of initial weight. In practice, the LASSO estimator itself has been used as the initial weight. However, this may not be preferable because the LASSO is itself inconsistent. To address this issue, an alternative initial weight in adaptive penalized logistic regression (CBPLR) is proposed. The effectiveness of the CBPLR is examined on three well-known high-dimensional cancer classification datasets using the number of selected genes, the area under the curve, and the misclassification rate. The experimental results show that the proposed CBPLR is efficient and feasible for cancer classification. Additionally, the proposed weight is compared with APLR and LASSO and exhibits competitive performance in both classification accuracy and gene selection. The proposed CBPLR has a significant impact on penalized logistic regression, selecting fewer genes while attaining a high area under the curve and a low misclassification rate. Thus, the proposed weight could conceivably be used in other research that implements gene selection in high-dimensional cancer classification. © 2015 Elsevier Ltd. All rights reserved.
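To make the two-stage idea concrete, the following is a minimal sketch of adaptive LASSO logistic regression in which a plain LASSO fit supplies the initial weights, the practice the abstract describes for APLR. It is not the authors' implementation, and it does not reproduce the CBPLR weight, which the abstract does not define. The function names, the proximal-gradient solver, the values of `lam`, `gamma`, and `eps`, and the synthetic data are all assumptions chosen for illustration.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def soft_threshold(v, t):
    """Proximal operator of t * |v|; it produces exact zeros."""
    if v > t:
        return v - t
    if v < -t:
        return v + t
    return 0.0


def fit_weighted_l1_logistic(X, y, lam, weights, step=0.1, n_iter=500):
    """Proximal gradient descent on
    mean logistic loss + lam * sum_j weights[j] * |beta[j]|.
    The intercept is not penalized."""
    n, p = len(X), len(X[0])
    intercept, beta = 0.0, [0.0] * p
    for _ in range(n_iter):
        grad0, grad = 0.0, [0.0] * p
        for xi, yi in zip(X, y):
            z = intercept + sum(b * x for b, x in zip(beta, xi))
            r = (sigmoid(z) - yi) / n  # per-sample residual, averaged
            grad0 += r
            for j in range(p):
                grad[j] += r * xi[j]
        intercept -= step * grad0
        beta = [soft_threshold(b - step * g, step * lam * w)
                for b, g, w in zip(beta, grad, weights)]
    return intercept, beta


def adaptive_lasso_logistic(X, y, lam, gamma=1.0, eps=1e-6):
    """Two-stage adaptive LASSO with LASSO initial weights (assumed setup)."""
    p = len(X[0])
    # Stage 1: plain LASSO (all weights equal to 1) gives the initial estimate.
    _, beta_init = fit_weighted_l1_logistic(X, y, lam, [1.0] * p)
    # Stage 2: genes with small initial coefficients receive large penalties.
    weights = [1.0 / (abs(b) + eps) ** gamma for b in beta_init]
    _, beta_adaptive = fit_weighted_l1_logistic(X, y, lam, weights)
    return beta_init, weights, beta_adaptive


# Synthetic stand-in for expression data: only the first two "genes" matter.
random.seed(0)
n, p = 60, 8
true_beta = [2.0, -2.0] + [0.0] * (p - 2)
X = [[random.gauss(0.0, 1.0) for _ in range(p)] for _ in range(n)]
y = [1.0 if random.random() < sigmoid(sum(b * x for b, x in zip(true_beta, xi)))
     else 0.0 for xi in X]

beta_init, weights, beta_adaptive = adaptive_lasso_logistic(X, y, lam=0.05)
selected_init = [j for j, b in enumerate(beta_init) if b != 0.0]
selected_adaptive = [j for j, b in enumerate(beta_adaptive) if b != 0.0]
print("LASSO selected:", selected_init)
print("Adaptive LASSO selected:", selected_adaptive)
```

In this sketch a gene dropped by the initial LASSO fit gets a weight of roughly `1 / eps` and so cannot re-enter the model in the second stage. This shows why the choice of initial weight matters: any selection error in the first stage is inherited by the adaptive fit.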
