Journal: IEEE Transactions on Knowledge and Data Engineering

Semisupervised learning of hierarchical latent trait models for data visualization


Abstract

Recently, we have developed the hierarchical generative topographic mapping (HGTM), an interactive method for visualization of large high-dimensional real-valued data sets. We propose a more general visualization system by extending HGTM in three ways, which allows the user to visualize a wider range of data sets and better supports the model development process. 1) We integrate HGTM with noise models from the exponential family of distributions. The basic building block is the latent trait model (LTM). This enables us to visualize data of inherently discrete nature, e.g., collections of documents, in a hierarchical manner. 2) We give the user a choice of initializing the child plots of the current plot in either interactive or automatic mode. In the interactive mode, the user selects "regions of interest," whereas in the automatic mode, an unsupervised minimum message length (MML)-inspired construction of a mixture of LTMs is employed. The unsupervised construction is particularly useful when high-level plots are covered with dense clusters of highly overlapping data projections, making it difficult to use the interactive mode. Such a situation often arises when visualizing large data sets. 3) We derive general formulas for magnification factors in latent trait models. Magnification factors are a useful tool for improving our understanding of the visualization plots, since they can highlight the boundaries between data clusters. We illustrate our approach on a toy example and evaluate it on three more complex real data sets.
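To make the latent trait building block more concrete, the following is a minimal sketch of how a single LTM with Bernoulli noise could project binary data onto a 2-D plot. It is not the authors' implementation. The grid size, the RBF basis, the random (untrained) weights, the uniform prior over latent grid points, and all function names are assumptions made for illustration. Parameter fitting, the hierarchy, the MML-inspired mixture construction, and magnification factors are all omitted.

```python
import numpy as np

def latent_grid(side):
    """Regular side x side grid of latent points in [-1, 1]^2."""
    axis = np.linspace(-1.0, 1.0, side)
    xx, yy = np.meshgrid(axis, axis)
    return np.column_stack([xx.ravel(), yy.ravel()])

def rbf_features(points, centres, width):
    """Gaussian RBF activations for each latent point, plus a bias column."""
    sq_dist = ((points[:, None, :] - centres[None, :, :]) ** 2).sum(axis=2)
    phi = np.exp(-sq_dist / (2.0 * width ** 2))
    return np.hstack([phi, np.ones((points.shape[0], 1))])

def bernoulli_means(phi, weights):
    """Map latent features to Bernoulli means via the logistic link."""
    return 1.0 / (1.0 + np.exp(-(phi @ weights)))

def responsibilities(data, means, eps=1e-9):
    """Posterior over latent grid points for each binary data vector,
    assuming a uniform prior over the grid (an assumption of this sketch)."""
    m = np.clip(means, eps, 1.0 - eps)
    log_lik = data @ np.log(m).T + (1.0 - data) @ np.log(1.0 - m).T
    log_lik -= log_lik.max(axis=1, keepdims=True)  # numerical stability
    resp = np.exp(log_lik)
    return resp / resp.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
grid = latent_grid(5)                              # 25 latent points
centres = latent_grid(2)                           # 4 RBF centres
phi = rbf_features(grid, centres, width=1.0)       # shape (25, 5)
weights = rng.normal(size=(phi.shape[1], 8))       # untrained, illustration only
means = bernoulli_means(phi, weights)              # shape (25, 8)
data = (rng.random((4, 8)) < 0.5).astype(float)    # 4 synthetic binary vectors
resp = responsibilities(data, means)               # shape (4, 25)
coords = resp @ grid                               # posterior-mean plot positions
```

Each row of `coords` is the posterior-mean position of one data vector on the plot. Swapping `bernoulli_means` and the log-likelihood in `responsibilities` for another exponential-family member (for example, multinomial for word counts) would follow the same pattern.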
