Feature Selection Using an Extended Piecewise Linear Orthonormal Floating Search.

Abstract

The piecewise linear orthonormal floating search (PLOFS) is a wrapper method for feature selection that uses a piecewise linear network (PLN) to evaluate candidate subsets. PLOFS has difficulty working on high-dimensional data due to overfitting and poor clustering in the PLN subset evaluation function (SEF), and due to high computational complexity. The presence of noise features aggravates these problems.

In order to improve upon the SEF used by PLOFS, we mapped the PLN to a SPLN. Then a second-order embedded feature selection was used to generate improved distance measure weights. Next, a second-order method for positioning center vectors was developed. The distance measure weights and improved center vectors are mapped back to the PLN, resulting in improved performance.

We analyze the behavior of noise and dependent features in OLS and use the results to develop a reliable method of eliminating these useless features, thereby extending PLOFS to problems with larger numbers of features. We augment the data with artificial random features as probes and use piecewise linear sequential forward search to identify the useless features and remove them from the data. A two-stage feature selection method that builds upon the basic PLOFS algorithm has been developed; it removes useless features and then generates subsets of different sizes from the remaining features using floating search. The resulting Extended PLOFS (EPLOFS) algorithm helps eliminate the ill effects of too many useless features in the final piecewise linear model, allowing it to be applied to larger datasets.

We have evaluated EPLOFS and compared its performance with that of several other feature selection methods. In the presence of a large number of noise features, EPLOFS consistently produced the optimal subset with only the useful features and no noise features. Subsets of various sizes produced by EPLOFS often have smaller testing errors than subsets of the same size produced by other methods. The presence of dependent features further deteriorated the performance of filter methods, while the performance of EPLOFS remained largely unaffected.
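The probe idea in the abstract (append artificial random features, run a sequential forward search, and treat features that rank no better than the probes as useless) can be illustrated with a small sketch. This is not the thesis's algorithm: it assumes a plain linear model with Gram-Schmidt orthogonalization in place of the piecewise linear network, and it assumes the search simply stops at the first probe it would select. The names `forward_select_with_probes`, `n_probes`, and `seed` are hypothetical.

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def center(col):
    mean = sum(col) / len(col)
    return [x - mean for x in col]


def forward_select_with_probes(columns, target, n_probes=5, seed=0):
    """Greedy forward search over real features plus random probes.

    columns: list of feature columns, each a list of floats.
    Returns indices of the real features chosen before the first probe.
    Assumption: a linear model stands in for the piecewise linear one.
    """
    rng = random.Random(seed)
    n, n_real = len(target), len(columns)
    # Artificial probe features: pure noise, unrelated to the target.
    probes = [[rng.gauss(0.0, 1.0) for _ in range(n)] for _ in range(n_probes)]
    cols = [center(c) for c in list(columns) + probes]
    resid = center(target)
    remaining = list(range(len(cols)))
    selected = []

    def gain(j):
        # Squared-error reduction obtained by adding candidate j.
        norm = dot(cols[j], cols[j])
        return dot(cols[j], resid) ** 2 / norm if norm > 1e-12 else 0.0

    while remaining:
        best = max(remaining, key=gain)
        # Stop once a probe wins (assumed rule) or nothing helps anymore.
        if best >= n_real or gain(best) <= 1e-12:
            break
        selected.append(best)
        remaining.remove(best)
        q = cols[best]
        qq = dot(q, q)
        # Remove the chosen direction from the residual target ...
        coef = dot(q, resid) / qq
        resid = [r - coef * a for r, a in zip(resid, q)]
        # ... and from every remaining candidate (Gram-Schmidt step).
        for j in remaining:
            k = dot(q, cols[j]) / qq
            cols[j] = [c - k * a for c, a in zip(cols[j], q)]
    return selected


if __name__ == "__main__":
    rng = random.Random(1)
    n = 200
    X = [[rng.gauss(0, 1) for _ in range(n)] for _ in range(4)]
    # Only features 0 and 2 drive the target; 1 and 3 are noise.
    y = [3 * X[0][i] - 2 * X[2][i] + 0.1 * rng.gauss(0, 1) for i in range(n)]
    print(forward_select_with_probes(X, y))
```

Because the stop is triggered by whichever candidate happens to score best on the residual, a real noise feature can occasionally slip in ahead of the probes; using more probes makes that less likely.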

Bibliographic record

  • Author

    Rawat, Rohit

  • Author affiliation

    The University of Texas at Arlington

  • Degree-granting institution: The University of Texas at Arlington
  • Subject: Electrical engineering
  • Degree: Ph.D.
  • Year: 2016
  • Pages: 129 p.
  • Total pages: 129
  • Original format: PDF
  • Language of text: eng
