
Extracting trends in high-dimensional datasets.

Abstract

High-dimensional data analysis is an important research area because the amount of data being collected is growing rapidly. This thesis seeks an information-revealing representation for high-dimensional data distributions that may contain local trends in certain subspaces. Examples are data that have continuous support in simple shapes with identifiable branches. Such data can be represented by a graph that consists of segments of locally fit principal curves or surfaces, each summarizing one identifiable branch. This thesis describes a new algorithm to find the optimal paths through such a principal graph. The paths are optimal in the sense that they represent the longest smooth trends through the data set, and jointly they cover the data set entirely with minimum overlap. The algorithm is suitable for hypothesizing trends in high-dimensional data and can assist exploratory data analysis and visualization. The thesis also develops a second algorithm, IRST, which identifies Information-Rich Subsets of high-dimensional data and extracts the order-based Subspace Trends present in them. The notion of trends, the implementation details, and the complexity analysis are described, along with results on synthetic and real-world sample datasets.

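Bibliographic record

  • Author

    Pokharkar, Snehal.;

  • Author affiliation

    Wayne State University.;

  • Degree-granting institution Wayne State University.;
  • Subject Computer Science.
  • Degree M.S.
  • Year 2009
  • Pages 86 p.
  • Total pages 86
  • Original format PDF
  • Language of text eng
  • Chinese Library Classification Automation technology, computer technology;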