
Restricting Supervised Learning: Feature Selection and Feature Space Partition.



Abstract

In this dissertation, we propose several novel techniques for restricting supervised learning problems through either feature selection or feature space partition. Among feature selection methods, 1-norm regularization is advocated by many researchers because it incorporates feature selection into the learning process itself. We focus on ranking problems, for which very little work has been done with the L1 penalty. We present a 1-norm support vector machine method that simultaneously finds a linear ranking function and performs feature subset selection. Because ranking is formulated as a classification task over pair-wise data, the computational complexity grows from linear to quadratic in the sample size; we therefore also propose a convex hull reduction method to mitigate this cost. The method was tested on one artificial data set and two benchmark real data sets, the Concrete Compressive Strength and Abalone data sets. In theory, tuning the trade-off parameter between the 1-norm penalty and the empirical error can produce a feature subset of any desired size, but computing the whole solution path over this parameter is extremely difficult, so 1-norm regularization alone may not yield a small feature subset. We therefore propose a recursive feature selection method based on 1-norm regularization that handles the multi-class setting effectively and efficiently. The selection proceeds iteratively: in each iteration, a linear multi-class classifier is trained with 1-norm regularization, which produces sparse weight vectors, i.e., many feature weights are exactly zero, and the zero-weight features are eliminated before the next iteration. The selection process converges quickly.
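The recursive selection loop described above can be sketched as follows. This is a minimal illustration, not the dissertation's exact formulation: it uses scikit-learn's L1-regularized `LogisticRegression` as a stand-in for the 1-norm multi-class classifier, and the function name, regularization strength `C`, and zero threshold are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def recursive_l1_selection(X, y, C=0.5, max_rounds=20):
    """Iteratively drop zero-weight features from an L1-regularized
    linear (multi-class, one-vs-rest) classifier until the set of
    surviving features stabilizes."""
    selected = np.arange(X.shape[1])
    for _ in range(max_rounds):
        clf = LogisticRegression(penalty="l1", solver="liblinear", C=C)
        clf.fit(X[:, selected], y)
        # A feature survives if its weight is nonzero for at least one class.
        support = np.any(np.abs(clf.coef_) > 1e-8, axis=0)
        if support.all() or not support.any():
            break  # converged, or the penalty zeroed everything out
        selected = selected[support]
    return selected
```

The fast convergence noted in the abstract shows up here as the loop typically terminating after only a few rounds, once no additional weights are driven to zero.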
We tested our method on an earthworm microarray data set, and the empirical results demonstrate that the selected features (genes) have very competitive discriminative power.

Feature space partition separates a complex learning problem into multiple non-overlapping simple sub-problems. It is normally implemented in a hierarchical fashion. Unlike a decision tree, a leaf node of this hierarchy does not represent a single decision; it represents a region (sub-problem) that is solvable with linear functions or other simple functions. In our work, we incorporate domain knowledge into the feature space partition process, considering domain information encoded by discrete or categorical attributes. Such an attribute provides a natural partition of the problem domain and hence divides the original problem into several non-overlapping sub-problems; the domain information is useful if this partition simplifies the learning task. However, it is not trivial to select the discrete or categorical attribute that maximally simplifies the learning task: a naive approach exhaustively searches all possible restructured problems, which is computationally prohibitive when the number of such attributes is large. We describe a metric that ranks attributes by their potential to reduce the uncertainty of a classification task. It is quantified as the conditional entropy achieved by a set of optimal classifiers, each built for a sub-problem defined by the attribute under consideration. To avoid high computational cost, we approximate the solution by the expected minimum conditional entropy with respect to random projections. This approach was tested on three artificial data sets, three cheminformatics data sets, and two leukemia gene expression data sets.
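The attribute-ranking idea can be illustrated with a simple proxy: partition the data by each value of a candidate categorical attribute, fit one linear classifier per sub-problem, and sum the size-weighted binary entropy of each sub-problem's residual training error. This sketch is only a rough stand-in for the dissertation's conditional-entropy metric (it uses training error rather than optimal classifiers, and omits the random-projection approximation); the function name and estimator choice are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

def partition_entropy(X, y, attr):
    """Score a categorical attribute: lower means its partition leaves
    less label uncertainty after fitting one linear model per sub-problem."""
    total, n = 0.0, len(y)
    for v in np.unique(attr):
        mask = attr == v
        ys = y[mask]
        if len(np.unique(ys)) < 2:
            continue  # pure sub-problem contributes zero entropy
        clf = LogisticRegression(max_iter=1000).fit(X[mask], ys)
        err = 1.0 - clf.score(X[mask], ys)  # training error as impurity proxy
        if err in (0.0, 1.0):
            h = 0.0
        else:  # binary entropy of the residual error
            h = -(err * np.log2(err) + (1 - err) * np.log2(1 - err))
        total += (mask.sum() / n) * h  # weight by partition size
    return total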
Empirical results demonstrate that our method selects a proper discrete or categorical attribute to simplify the problem; that is, the classifier built for the restructured problem consistently outperforms the one built for the original problem.

Restricting supervised learning ultimately means building simple learning functions from a limited number of features. The Top Selected Pair (TSP) method builds simple classifiers from very few (for example, two) features using simple arithmetic comparisons, but the traditional TSP method handles only static data. In this dissertation, we propose classification methods for time series data that depend on only a few pairs of features. Based on different comparison strategies, we developed the following approaches: TSP based on average, TSP based on trend, and TSP based on trend and absolute difference amount. In addition, inspired by the idea of using two features, we propose a time series classification method based on a few feature pairs using dynamic time warping and nearest neighbor. (Abstract shortened by UMI.)
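The "TSP based on average" strategy can be sketched as follows: for each ordered pair of channels, classify a series by whether the mean of one channel exceeds the mean of the other, and keep the pair whose rule best matches the training labels. The data layout (samples x channels x time steps) and function names are assumptions for illustration, not the dissertation's exact formulation.

```python
import numpy as np

def tsp_average_predict(series, i, j):
    """One-pair rule: predict class 1 if the mean of channel i
    exceeds the mean of channel j over the series, else class 0."""
    return int(series[i].mean() > series[j].mean())

def fit_tsp_average(train, labels):
    """Exhaustively pick the ordered channel pair (i, j) whose
    average-comparison rule best matches the binary labels.
    train has shape (n_samples, n_channels, n_timesteps)."""
    n_chan = train.shape[1]
    best, best_acc = (0, 1), -1.0
    for i in range(n_chan):
        for j in range(n_chan):
            if i == j:
                continue
            preds = (train[:, i, :].mean(axis=1) >
                     train[:, j, :].mean(axis=1)).astype(int)
            acc = (preds == labels).mean()
            if acc > best_acc:
                best, best_acc = (i, j), acc
    return best, best_acc
```

Because the final rule depends on a single pair of channels and one comparison, the resulting classifier stays as interpretable as the static TSP classifiers the abstract describes.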

Bibliographic Record

  • Author

    Nan, Xiaofei.;

  • Affiliation

    The University of Mississippi.;

  • Degree-granting institution: The University of Mississippi.
  • Subject: Computer Science.
  • Degree: Ph.D.
  • Year: 2012
  • Pagination: 120 p.
  • Total pages: 120
  • Format: PDF
  • Language: eng
  • Date added: 2022-08-17 11:42:40

