Conference: IEEE International Congress on Big Data

Revealing the fog-of-war: A visualization-directed, uncertainty-aware approach for exploring high-dimensional data

Abstract

Dimensionality Reduction (DR) is a crucial tool to facilitate high-dimensional data analysis. As the volume and the variety of features used to describe a phenomenon keep increasing, DR has become not only desirable but paramount. However, DR can result in unreliable depictions of data. The uncertainties involved in DR may stem from the selection of methods, parameter configurations, and the constraints imposed by the user. To address these uncertainties, various means of DR quality assessment have been proposed in the literature. Nevertheless, how to optimize the trade-off between quantification efficiency and accuracy remains to be studied further. The purpose of this paper is to present a general technique, in the context of visual analytics, to support efficient uncertainty-aware high-dimensional data exploration. We model the uncertainty based on how well neighborhood geometries are preserved during DR. We employ approximate nearest neighbor (ANN) search algorithms to speed up the quantification process with a marginal decrease in accuracy. We then visualize the quantified uncertainties in the form of an augmented scatter plot. We test our technique with three real-world datasets against several well-known DR techniques, and discuss possible underlying causes that lead to certain embedding patterns. Our results show that our approach is effective and beneficial for both DR assessment and user-centered data exploration.
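
The abstract does not give the paper's exact uncertainty formula, so the following is only a minimal sketch of the general idea of scoring each point by how well its neighborhood survives DR. It assumes a simple score, one minus the fraction of a point's k nearest neighbors in the original space that remain among its k nearest neighbors in the embedding. The function names `knn_indices` and `neighborhood_uncertainty` are hypothetical, and the neighbor search here is exact brute force, standing in for the ANN search the authors use for speed.

```python
import math
import random


def knn_indices(points, k):
    """For each point, return the set of indices of its k nearest neighbors.

    Exact brute-force search (O(n^2)); an ANN index would replace this
    step in a large-scale setting.
    """
    result = []
    for i, p in enumerate(points):
        dists = [(math.dist(p, q), j) for j, q in enumerate(points) if j != i]
        dists.sort()
        result.append({j for _, j in dists[:k]})
    return result


def neighborhood_uncertainty(high, low, k=5):
    """Per-point score in [0, 1]: 0 means all k neighbors were preserved.

    Assumed definition (not taken from the paper):
        1 - |N_high(i) & N_low(i)| / k
    """
    if len(high) != len(low):
        raise ValueError("high and low must contain the same number of points")
    if not 1 <= k < len(high):
        raise ValueError("k must satisfy 1 <= k < number of points")
    n_high = knn_indices(high, k)
    n_low = knn_indices(low, k)
    return [1.0 - len(a & b) / k for a, b in zip(n_high, n_low)]


if __name__ == "__main__":
    random.seed(0)
    # Synthetic 10-D data; keeping only the first two coordinates is a
    # crude stand-in for a real DR method.
    data = [tuple(random.random() for _ in range(10)) for _ in range(50)]
    embedding = [p[:2] for p in data]
    scores = neighborhood_uncertainty(data, embedding, k=5)
    print("mean uncertainty:", sum(scores) / len(scores))
```

The per-point scores could then be mapped to a visual channel such as color or opacity in a scatter plot of the embedding, which is one plausible reading of the augmented scatter plot mentioned above.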