Geometric Representations of High Dimensional Random Data.

Abstract

This thesis introduces geometric representations relevant to the analysis of datasets of random vectors in high dimension. These representations are used to study the behavior of near-neighbor clusters in the dataset, shortest paths through the dataset, and the evolution of multivariate probability distributions over the dataset. The results have wide applicability to machine learning problems and are illustrated for problems including spectral clustering, dimensionality reduction, activity recognition, and video indexing and retrieval.

This thesis makes three contributions. The first concerns shortest paths over random points in a Riemannian manifold. More precisely, we establish complete convergence results for power-weighted shortest path lengths in compact Riemannian manifolds to conformal deformation distances. These shortest path results are used to interpret and extend Coifman's anisotropic diffusion maps for clustering and dimensionality reduction. The second contribution uses statistical manifolds to describe differences between curves evolving over a space of probability measures. A statistical manifold is a space of probability measures induced by the Fisher-Riemann metric. We propose to compare smoothly evolving probability distributions in a statistical manifold by the surface area of the region between a pair of curves. The surface area measure is applied to activity classification for human movements. The third contribution is a dimensionality reduction and cluster analysis framework that uses a quantum mechanical model. This model leads to a generalization of geometric clustering methods such as k-means and Laplacian eigenmaps, in which the logical equivalence relation "two points are in the same cluster" is relaxed to a probabilistic equivalence relation.
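
The power-weighted shortest path mentioned in the abstract can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes points in flat Euclidean space rather than a general Riemannian manifold, builds the complete graph over the sample, and charges each edge the Euclidean length raised to a power `p`. The function name `power_weighted_shortest_path` and the parameter `p` are hypothetical names chosen for this illustration.

```python
import heapq
import math


def power_weighted_shortest_path(points, source, target, p=2.0):
    """Length of the cheapest path from points[source] to points[target].

    Assumption for illustration: edge cost = (Euclidean distance) ** p on the
    complete graph over `points`. Computed with Dijkstra's algorithm.
    """
    n = len(points)
    best = [math.inf] * n          # best known cost to reach each point
    best[source] = 0.0
    settled = [False] * n          # True once a point's cost is final
    heap = [(0.0, source)]

    while heap:
        cost, u = heapq.heappop(heap)
        if settled[u]:
            continue               # stale heap entry
        settled[u] = True
        if u == target:
            return cost
        for v in range(n):
            if settled[v]:
                continue
            candidate = cost + math.dist(points[u], points[v]) ** p
            if candidate < best[v]:
                best[v] = candidate
                heapq.heappush(heap, (candidate, v))

    return best[target]


# Three collinear points: with p = 2 the two short hops (1 + 1) are cheaper
# than the single direct hop (2 ** 2 = 4).
sample = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
print(power_weighted_shortest_path(sample, 0, 2, p=2.0))
```

With `p > 1`, long hops are penalized more than several short ones, so the cheapest path tends to pass through regions where sample points are dense. This is the intuition behind relating such path lengths to a density-dependent distance.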

Bibliographic record

  • Author: Hwang, Sung Jin
  • Author affiliation: University of Michigan
  • Degree-granting institution: University of Michigan
  • Subject: Statistics; Engineering, Electronics and Electrical
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 121 p.
  • Total pages: 121
  • Original format: PDF
  • Language of text: English
