Random graphs for structure discovery in high-dimensional data.

Abstract

In work originally motivated by computational considerations, we demonstrate how computationally efficient and scalable graph constructions can be used to encode both statistical and spatial information and to address the problems of dimension reduction and structure discovery in high-dimensional data, with provable results.

We discuss the asymptotic behavior of power-weighted functionals of minimal Euclidean graphs, proving upper and lower bounds for the respective convergence rates and connecting them to the problem of nonparametric estimation of entropy.

We then extend the convergence results from Euclidean graphs to the setting of data that span a high-dimensional space but contain fundamental features concentrated on lower-dimensional subsets of this space, such as curves, surfaces or, more generally, lower-dimensional manifolds. In particular, we have developed a novel geometric probability approach to the problem of estimating the intrinsic dimension and entropy of manifold data, based on asymptotic properties of graphs such as Minimal Spanning Trees or k-Nearest Neighbor graphs. Unlike previous solutions to this problem, we are able to prove statistical consistency of the obtained estimators for the wide class of Riemannian submanifolds of a Euclidean space. We also propose a graph-based dimensionality reduction method aimed at extracting lower-dimensional features designed expressly to improve classification, with applications to both supervised and semi-supervised learning problems.

Finally, using neighborhood graphs and the multidimensional scaling principle, we develop a general tool for dimensionality reduction in sensor networks, where communication constraints exist and distributed optimization is required. This tool is illustrated through an application to localization in sensor networks.
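
The idea of reading intrinsic dimension off the growth of a graph's total edge length can be illustrated with a small sketch. It is not the thesis's estimator. It assumes the standard scaling for power-weighted k-nearest-neighbor graphs, in which the total length over n samples grows roughly like n^((d - gamma)/d) for intrinsic dimension d and edge exponent gamma, so that d can be recovered from the slope of log-length against log n. The function names, the subsample sizes, and the choice of a unit sphere in R^3 as test data are all assumptions made for illustration.

```python
import math
import random


def sample_sphere(n, rng):
    """Draw n points uniformly from the unit sphere (a 2-D manifold in R^3)."""
    points = []
    for _ in range(n):
        v = [rng.gauss(0.0, 1.0) for _ in range(3)]
        norm = math.sqrt(sum(c * c for c in v))
        points.append([c / norm for c in v])
    return points


def knn_graph_length(points, k=4, gamma=1.0):
    """Sum, over all points, of the gamma-th powers of the distances
    to each point's k nearest neighbors (brute force, O(n^2 log n))."""
    total = 0.0
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        total += sum(d ** gamma for d in dists[:k])
    return total


def estimate_intrinsic_dimension(points, sizes, k=4, gamma=1.0, trials=4, seed=0):
    """Fit log(length) = slope * log(n) + const over random subsamples and
    invert the assumed relation slope = (d - gamma) / d."""
    rng = random.Random(seed)
    xs, ys = [], []
    for n in sizes:
        lengths = [knn_graph_length(rng.sample(points, n), k, gamma)
                   for _ in range(trials)]
        xs.append(math.log(n))
        ys.append(math.log(sum(lengths) / trials))
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return gamma / (1.0 - slope)


if __name__ == "__main__":
    data = sample_sphere(400, random.Random(1))
    print(estimate_intrinsic_dimension(data, sizes=[100, 200, 400]))
```

With only a few hundred points the estimate is noisy, and in practice it would typically be rounded to the nearest integer. A spatial index would replace the brute-force neighbor search for larger samples.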

Bibliographic record

  • Author

    Costa, Jose Antonio O.

  • Author affiliation

    University of Michigan

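  • Degree-granting institution University of Michigan
  • Subject Engineering, Electronics and Electrical; Statistics
  • Degree Ph.D.
  • Year 2005
  • Pagination 160 p.
  • Total pages 160
  • Original format PDF
  • Language of text eng
  • Chinese Library Classification Radio electronics and telecommunications technology; Statistics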
