A projection pursuit framework for supervised dimension reduction of high dimensional small sample datasets

Abstract

The analysis and interpretation of datasets with a large number of features and few examples remains a challenging problem in the scientific community, owing to the difficulties associated with the curse of dimensionality. Projection Pursuit (PP) has shown promise in circumventing this phenomenon by searching for low-dimensional projections of the data in which meaningful structures are exposed. However, PP faces computational difficulties when dealing with datasets containing thousands of features (typical in genomics and proteomics) because of the vast number of parameters to optimize. In this paper we describe and evaluate a PP framework aimed at relieving such difficulties and thus easing the construction of classifier systems. The framework is a two-stage approach, where the first stage performs a rapid compaction of the data and the second stage implements the PP search using an improved version of the SPP method (Guo et al., 2000, [32]). In an experimental evaluation with eight public microarray datasets, we showed that some configurations of the proposed framework can clearly outperform eight well-established dimension reduction methods in their ability to pack more discriminatory information into fewer dimensions.
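
To make the two-stage idea concrete, here is a minimal sketch in Python. It is not the authors' method: the abstract does not say how the compaction is done or how the improved SPP search works, so this sketch assumes PCA for stage one and a plain random search over unit directions for stage two. The projection index (`fisher_index`), the helper names, and the toy data are all hypothetical and chosen only for illustration.

```python
import numpy as np

def compact(X, k):
    """Stage 1 (ASSUMED to be PCA here): project centered data onto the top-k principal axes."""
    mean = X.mean(axis=0)
    Xc = X - mean
    # Thin SVD stays cheap when there are far fewer samples than features.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    W = Vt[:k].T  # shape (n_features, k), orthonormal columns
    return Xc @ W, W, mean

def fisher_index(z, y):
    """Hypothetical 1-D projection index: between-class over within-class scatter."""
    classes = np.unique(y)
    grand = z.mean()
    between = sum((y == c).sum() * (z[y == c].mean() - grand) ** 2 for c in classes)
    within = sum(((z[y == c] - z[y == c].mean()) ** 2).sum() for c in classes)
    return between / (within + 1e-12)

def pursue(Z, y, n_trials=2000, seed=0):
    """Stage 2 (stand-in for SPP): random search for a unit direction maximizing the index."""
    rng = np.random.default_rng(seed)
    best_a, best_val = None, -np.inf
    for _ in range(n_trials):
        a = rng.standard_normal(Z.shape[1])
        a /= np.linalg.norm(a)
        val = fisher_index(Z @ a, y)
        if val > best_val:
            best_a, best_val = a, val
    return best_a, best_val

def make_toy_data(n_per_class=15, n_features=500, seed=1):
    """Synthetic 'many features, few samples' data; only 5 features carry class signal."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((2 * n_per_class, n_features))
    y = np.repeat([0, 1], n_per_class)
    X[y == 1, :5] += 3.0
    return X, y

X, y = make_toy_data()
Z, W, mean = compact(X, k=5)   # 500 features -> 5 compacted features
a, score = pursue(Z, y)        # search only over 5 parameters instead of 500
direction = W @ a              # the found direction expressed in the original feature space
```

The point of the design is visible in the last three lines: the search in stage two optimizes over `k` parameters rather than one per original feature, which is the kind of relief from "the vast number of parameters to optimize" that the abstract describes.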