Conference: Brazilian Conference on Intelligent Systems

An Online Pyramidal Embedding Technique for High Dimensional Big Data Visualization

Abstract

Visualizing multidimensional Big Data is challenging: high dimensionality hinders or even precludes visual inspection. One way to tackle this issue is to use Dimensionality Reduction (DR) techniques, which produce low-dimensional representations of high-dimensional data. Popular DR algorithms (e.g., Principal Component Analysis, t-Distributed Stochastic Neighbor Embedding), albeit helpful, are computationally expensive. Most have O(n^2) or O(n^3) Asymptotic Time Complexity (ATC) and/or calculate pairwise distances over the entire data set, exceeding available memory and rendering Big Data DR time-consuming or impracticable. These issues impede the use of DR in online learning applications, where recurrent, cumulative model updates are routine. The stochastic component of some approaches likewise obstructs any meaningful inspection of how knowledge is spatially arranged. The recently introduced Polygonal Coordinate System (PCS), an incremental, geometry-based technique with linear ATC, is compelling; however, its restriction to 2-D embeddings results in significant information loss. We propose the Big Data-ready, incremental Pyramidal Embedding System (PES), which builds on the strengths of PCS by additionally generating 3-D embeddings through its pyramid-like interspace, mitigating quality degradation. Visual inspections, as well as statistical analyses based on pairwise distances, validate the ability of PES to retain structural arrangements when embedding high- and low-dimensional data while remaining flexible in resource consumption.
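
To make two points from the abstract concrete, here is a minimal Python sketch. It shows how quickly a dense pairwise distance matrix outgrows memory, and one possible way to check whether an embedding preserves pairwise distances. It is not the PES or PCS algorithm, and it is not necessarily the statistic the authors used. The function names, the 8-byte item size, the Pearson correlation score, and the coordinate-truncation "embedding" are all illustrative assumptions.

```python
import itertools
import math
import random


def pairwise_matrix_bytes(n, itemsize=8):
    """Bytes needed for a dense n x n distance matrix (assumes 8-byte floats)."""
    return n * n * itemsize


def pairwise_distances(points):
    """Euclidean distance for every unordered pair of points: O(n^2) work."""
    return [math.dist(p, q) for p, q in itertools.combinations(points, 2)]


def pearson(a, b):
    """Plain Pearson correlation coefficient of two equal-length sequences."""
    mean_a = sum(a) / len(a)
    mean_b = sum(b) / len(b)
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def distance_preservation(high, low):
    """Hypothetical quality score: correlation between the pairwise distances
    of the original points and those of their embedding."""
    return pearson(pairwise_distances(high), pairwise_distances(low))


random.seed(0)
# 100 synthetic points in 10 dimensions (made-up data, for illustration only).
high = [[random.gauss(0.0, 1.0) for _ in range(10)] for _ in range(100)]
# Stand-in 3-D "embedding": keep the first three coordinates. NOT the PES method.
low = [p[:3] for p in high]

score = distance_preservation(high, low)
print(f"distance preservation score: {score:.3f}")
print(f"dense matrix for 1e6 points: {pairwise_matrix_bytes(1_000_000) / 1e12:.0f} TB")
```

For one million points the dense matrix alone would need about 8 TB under these assumptions, which is why methods that avoid computing all pairwise distances matter for online use. Note that this check is itself quadratic, so on large data it would have to run on a sample.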
