Random projections as regularizers: learning a linear discriminant ensemble from fewer observations than dimensions


Abstract

We examine the performance of an ensemble of randomly-projected Fisher Linear Discriminant (FLD) classifiers, focusing on the case where there are fewer training observations than data dimensions. The ensemble is learned from a sequence of randomly-projected representations of the original high-dimensional data, so the data can be collected, stored and processed entirely in this compressed form. The specific form and simplicity of the ensemble permit a direct and much more detailed analysis than the generic tools used in previous work. In particular, we derive the exact form of the generalization error of the ensemble, conditional on the training set, and from this we give theoretical guarantees that directly link the performance of the ensemble to that of the corresponding linear discriminant learned in the full data space. To the best of our knowledge, these are the first theoretical results to prove such an explicit link for any classifier and classifier ensemble pair. Furthermore, we show that the randomly-projected ensemble is equivalent to applying a sophisticated regularization scheme to the linear discriminant learned in the original data space, and that this prevents overfitting in small-sample conditions where the pseudo-inverse FLD learned in the data space is provably poor.
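
To make the construction in the abstract concrete, the following is a minimal sketch of a two-class ensemble of randomly-projected FLD classifiers. It is an illustration under stated assumptions, not the authors' implementation: the Gaussian projection matrices, the averaging of the individual discriminants, the 0/1 label encoding, and the names `fit_rp_fld_ensemble` and `predict` are all choices made here for the example.

```python
import numpy as np


def fit_rp_fld_ensemble(X, y, k, n_members, rng):
    """Fit an averaged ensemble of randomly-projected FLD classifiers.

    Assumptions of this sketch (not taken from the abstract): labels are
    0/1, projections are i.i.d. standard Gaussian, and the ensemble
    averages the member discriminants. Returns (w, b) such that the
    decision rule is sign(x @ w + b).
    """
    n, d = X.shape
    if k > n - 2:
        # The centred data has rank at most n - 2, so a larger k would
        # make the projected covariance estimate singular.
        raise ValueError("k must be at most n - 2")

    X0, X1 = X[y == 0], X[y == 1]
    mu0, mu1 = X0.mean(axis=0), X1.mean(axis=0)
    delta = mu1 - mu0
    centred = np.vstack([X0 - mu0, X1 - mu1])  # class-wise centred data

    w = np.zeros(d)
    for _ in range(n_members):
        R = rng.standard_normal((k, d))   # random projection, k << d
        Z = centred @ R.T                 # data in the projected space
        S = Z.T @ Z / n                   # k x k pooled covariance estimate
        # FLD direction in the projected space, mapped back to d dimensions
        w += R.T @ np.linalg.solve(S, R @ delta)
    w /= n_members

    b = -w @ (mu0 + mu1) / 2.0            # threshold at the class midpoint
    return w, b


def predict(w, b, X):
    """Assign label 1 where the averaged discriminant is positive."""
    return (X @ w + b > 0).astype(int)
```

Each member only ever inverts a k x k matrix estimated from projected data, which is why the scheme remains well defined when the number of observations is smaller than the dimension. Calling `fit_rp_fld_ensemble(X, y, k=10, n_members=100, rng=np.random.default_rng(0))` on a 40 x 200 training matrix would be a typical use of this sketch.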
