Journal: Expert Systems with Applications
A comparative study of nonlinear manifold learning methods for cancer microarray data classification

Abstract

This paper presents an empirical comparison of the most prominent nonlinear manifold learning techniques for dimensionality reduction in the context of high-dimensional microarray data classification. In particular, we assessed the performance of six methods: isometric feature mapping (Isomap), locally linear embedding (LLE), Laplacian eigenmaps (LE), Hessian eigenmaps (HE), local tangent space alignment (LTSA) and maximum variance unfolding (MVU). Unlike previous studies on the subject, the experimental framework adopted in this work properly extends the supervised learning paradigm to dimensionality reduction by regarding the test set as an out-of-sample set of new points that are excluded from the manifold learning process. This avoids a possible overestimate of the classification accuracy, which may yield misleading comparative results. This empirical approach requires a fast and effective out-of-sample embedding method for mapping new high-dimensional data points into an existing reduced space. To this end we propose to apply multi-output kernel ridge regression, an extension of linear ridge regression based on kernel functions, which has recently been presented as a powerful method for out-of-sample projection when combined with a variant of isometric feature mapping. Computational experiments on a wide collection of cancer microarray data sets show that classifiers based on Isomap, LLE and LE were consistently more accurate than those relying on HE, LTSA and MVU. In particular, under different experimental conditions the LLE-based classifier emerged as the most effective method, whereas Isomap turned out to be the second-best alternative for dimensionality reduction.
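The out-of-sample step described in the abstract can be illustrated with a small example. What follows is a minimal NumPy sketch, not the authors' implementation: the Gaussian kernel, the values of `gamma` and `lam`, the helper names `rbf_kernel`, `fit_krr` and `embed_new`, and the toy helix data with a stand-in two-dimensional embedding are all assumptions made for illustration. The idea is that the embedding is learned on training points only, a multi-output kernel ridge regression is fitted from the original space to the embedding coordinates, and held-out points are then mapped through that regression.

```python
import numpy as np

def rbf_kernel(A, B, gamma):
    """Gaussian kernel matrix with entries exp(-gamma * ||a - b||^2)."""
    sq = (A ** 2).sum(axis=1)[:, None] + (B ** 2).sum(axis=1)[None, :] - 2.0 * A @ B.T
    return np.exp(-gamma * np.maximum(sq, 0.0))

def fit_krr(X_train, Y_train, gamma=1.0, lam=1e-3):
    """Solve (K + lam * I) A = Y_train; A has one column per embedding coordinate."""
    K = rbf_kernel(X_train, X_train, gamma)
    return np.linalg.solve(K + lam * np.eye(len(X_train)), Y_train)

def embed_new(X_new, X_train, A, gamma=1.0):
    """Map unseen points into the reduced space learned on the training set."""
    return rbf_kernel(X_new, X_train, gamma) @ A

# Toy data (assumption): points on a 3-D helix governed by one parameter t.
rng = np.random.default_rng(0)
t = rng.uniform(0.0, 1.0, 120)
X = np.column_stack([np.cos(4 * t), np.sin(4 * t), t])

# Stand-in for the output of a manifold learner (assumption): two smooth coordinates.
Y = np.column_stack([t, t ** 2])

# Only the first 100 points take part in "manifold learning"; the rest are out-of-sample.
X_train, Y_train = X[:100], Y[:100]
X_test, Y_test = X[100:], Y[100:]

A = fit_krr(X_train, Y_train, gamma=1.0, lam=1e-3)
Y_pred = embed_new(X_test, X_train, A, gamma=1.0)
max_err = float(np.abs(Y_pred - Y_test).max())
print("max absolute out-of-sample error:", max_err)
```

In a real experiment `Y_train` would come from one of the six manifold learners applied to the training samples, and a classifier would be trained on `Y_train` and evaluated on `Y_pred`, so that the test samples never influence the learned embedding.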
