
Studies on dimension reduction and feature spaces


Abstract

Today's world produces and stores huge amounts of data, which calls for methods that can tackle both the growing sizes and the growing dimensionalities of data sets. Dimension reduction aims at answering the challenges posed by the latter. Many dimension reduction methods consist of a metric transformation part followed by optimization of a cost function. Several classes of cost functions have been developed and studied, while metrics have received less attention. We promote the view that metrics should be lifted to a more independent role in dimension reduction research.

The subject of this work is the interaction of metrics with dimension reduction. The work builds on a series of studies on current topics in dimension reduction and neural network research. Neural networks are used both as a tool and as a target for dimension reduction. When the results of modeling or clustering are represented as a metric, they can be studied using dimension reduction, or they can be used to introduce new properties into a dimension reduction method. We give two examples of such use: visualizing the results of hierarchical clustering, and creating supervised variants of existing dimension reduction methods by using a metric built on the feature space of a neural network. Combining clustering with dimension reduction yields a novel way of creating space-efficient visualizations that convey both the hierarchical structure and the distances between clusters.

We study the feature spaces used in a recently developed neural network architecture called the extreme learning machine. We give a novel interpretation of such neural networks and identify the need to parameterize extreme learning machines with the variance of the network weights. This has practical implications for the use of extreme learning machines, since current practice emphasizes the number of hidden units and ignores the variance.

A current trend in deep neural network research is to use cost functions from dimension reduction methods to train networks for supervised dimension reduction. We show that equally good results can be obtained by training a bottlenecked neural network for classification or regression, which is faster than using a dimension reduction cost.

We demonstrate that, contrary to current belief, using sparse distance matrices to create fast dimension reduction methods is feasible, provided a proper balance between short-distance and long-distance entries in the sparse matrix is maintained. This observation opens up a promising research direction, with the possibility of using modern dimension reduction methods on much larger data sets than are manageable today.
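To make the abstract's point about weight variance concrete, here is a minimal extreme learning machine sketch. It is an illustrative toy, not the thesis code: the standard deviation `sigma` of the random hidden weights is exposed as an explicit parameter alongside the usual number of hidden units.

```python
import numpy as np

# Minimal extreme learning machine sketch (illustrative, not the thesis code).
# Hidden-layer weights are random and fixed; only the linear output layer is
# fitted. Besides the usual n_hidden, the variance of the random weights is
# exposed via `sigma`, the parameter the abstract argues should not be ignored.

def elm_fit(X, y, n_hidden=200, sigma=1.0, reg=1e-3, seed=0):
    rng = np.random.default_rng(seed)
    W = rng.normal(0.0, sigma, size=(X.shape[1], n_hidden))  # random input weights
    b = rng.normal(0.0, sigma, size=n_hidden)                # random biases
    H = np.tanh(X @ W + b)                                   # random feature space
    # Ridge-regularized least squares for the output weights.
    beta = np.linalg.solve(H.T @ H + reg * np.eye(n_hidden), H.T @ y)
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta
```

Varying `sigma` changes how saturated the tanh features are, so two ELMs with the same number of hidden units can behave very differently; this is the practical implication the abstract points at.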
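The bottleneck claim can be sketched in the same spirit. Assuming a standard PyTorch setup (my choice of framework, not one named in the abstract), a classifier with a narrow hidden layer produces a supervised low-dimensional embedding as a by-product of ordinary cross-entropy training:

```python
import torch.nn as nn

# Sketch of supervised dimension reduction via a bottleneck (assumed setup,
# not the thesis architecture). The network is trained as a plain classifier;
# after training, the activations of the narrow bottleneck layer serve as the
# low-dimensional embedding.

class BottleneckClassifier(nn.Module):
    def __init__(self, n_in, n_classes, bottleneck_dim=2):
        super().__init__()
        self.encoder = nn.Sequential(
            nn.Linear(n_in, 64),
            nn.ReLU(),
            nn.Linear(64, bottleneck_dim),  # the bottleneck
        )
        self.head = nn.Linear(bottleneck_dim, n_classes)

    def forward(self, x):
        z = self.encoder(x)  # low-dimensional embedding, read out after training
        return self.head(z), z
```

Training this with a plain classification loss on the logits evaluates no pairwise dimension reduction cost, which is the source of the speed advantage the abstract reports.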

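Finally, the balance between short-distance and long-distance entries can be illustrated with a toy construction of a sparse distance matrix; this is my own example of the idea, not the recipe from the thesis. Each point keeps its k nearest neighbours plus a small budget of randomly chosen long-range partners:

```python
import numpy as np
from scipy.sparse import coo_matrix
from scipy.spatial.distance import cdist

def sparse_distance_matrix(X, k=10, n_long=5, seed=0):
    """Sparse distances: k short-range entries plus n_long random long-range
    entries per point (illustrative construction, not the thesis recipe)."""
    rng = np.random.default_rng(seed)
    n = X.shape[0]
    D = cdist(X, X)  # dense distances, computed here only for clarity
    rows, cols, vals = [], [], []
    for i in range(n):
        short = np.argsort(D[i])[1:k + 1]   # k nearest neighbours (skip self)
        far = rng.choice(n, size=n_long)    # random long-range partners
        idx = np.unique(np.concatenate([short, far]))
        idx = idx[idx != i]                 # drop self and duplicates
        rows.extend([i] * len(idx))
        cols.extend(idx.tolist())
        vals.extend(D[i, idx].tolist())
    return coo_matrix((vals, (rows, cols)), shape=(n, n))
```

With n_long=0 this degenerates to a plain k-NN graph; the abstract's observation is that retaining some long-range entries is what keeps such sparse methods feasible.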
Bibliographic information

  • Author

    Parviainen, Eli

  • Author affiliation
  • Year: 2011
  • Total pages
  • Format: PDF
  • Language: en
  • Classification

