Journal: IEEE Transactions on Visualization and Computer Graphics

TopicLens: Efficient Multi-Level Visual Topic Exploration of Large-Scale Document Collections


Abstract

Topic modeling, which reveals underlying topics of a document corpus, has been actively adopted in visual analytics for large-scale document collections. However, due to its significant processing time and non-interactive nature, topic modeling has so far not been tightly integrated into a visual analytics workflow. Instead, most such systems are limited to utilizing a fixed, initial set of topics. Motivated by this gap in the literature, we propose a novel interaction technique called TopicLens that allows a user to dynamically explore data through a lens interface where topic modeling and the corresponding 2D embedding are efficiently computed on the fly. To support this interaction in real time while maintaining view consistency, we propose a novel efficient topic modeling method and a semi-supervised 2D embedding algorithm. Our work is based on improving state-of-the-art methods such as nonnegative matrix factorization and t-distributed stochastic neighbor embedding. Furthermore, we have built a web-based visual analytics system integrated with TopicLens. We use this system to measure the performance and the visualization quality of our proposed methods. We provide several scenarios showcasing the capability of TopicLens using real-world datasets.
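To make the lens idea concrete, the following is a minimal sketch of re-running topic modeling only on the documents under a lens. It is not the authors' method: the abstract does not describe their efficient topic modeling algorithm or their semi-supervised embedding, so this sketch substitutes plain nonnegative matrix factorization with standard multiplicative updates and omits the 2D embedding step entirely. The function names (`lens_select`, `nmf`), the circular lens shape, and the toy term counts and positions are all assumptions made for illustration.

```python
import random

def lens_select(points, center, radius):
    """Return indices of documents whose 2D position lies inside a circular lens."""
    cx, cy = center
    return [i for i, (x, y) in enumerate(points)
            if (x - cx) ** 2 + (y - cy) ** 2 <= radius ** 2]

def transpose(a):
    return [list(row) for row in zip(*a)]

def matmul(a, b):
    cols = list(zip(*b))
    return [[sum(x * y for x, y in zip(row, col)) for col in cols] for row in a]

def nmf(v, k, iters=200, seed=0, eps=1e-9):
    """Factor a nonnegative matrix V (docs x terms) into W (docs x k) and
    H (k x terms) with multiplicative updates for the Frobenius objective.
    This is generic NMF, not the variant proposed in the paper."""
    rng = random.Random(seed)
    n, m = len(v), len(v[0])
    w = [[rng.random() + 0.1 for _ in range(k)] for _ in range(n)]
    h = [[rng.random() + 0.1 for _ in range(m)] for _ in range(k)]
    for _ in range(iters):
        # Update H: H <- H * (W^T V) / (W^T W H)
        wt = transpose(w)
        num, den = matmul(wt, v), matmul(matmul(wt, w), h)
        h = [[h[i][j] * num[i][j] / (den[i][j] + eps) for j in range(m)]
             for i in range(k)]
        # Update W: W <- W * (V H^T) / (W H H^T)
        ht = transpose(h)
        num, den = matmul(v, ht), matmul(w, matmul(h, ht))
        w = [[w[i][j] * num[i][j] / (den[i][j] + eps) for j in range(k)]
             for i in range(n)]
    return w, h

def frobenius_error(v, w, h):
    wh = matmul(w, h)
    return sum((a - b) ** 2 for rv, rw in zip(v, wh) for a, b in zip(rv, rw)) ** 0.5

# Toy data (invented): term counts per document and precomputed 2D positions.
vocab = ["graph", "layout", "topic", "corpus"]
counts = [
    [3, 2, 0, 0],
    [2, 3, 0, 0],
    [0, 0, 3, 2],
    [0, 0, 2, 3],
    [1, 1, 1, 1],
    [4, 0, 0, 1],
]
positions = [(0.1, 0.1), (0.2, 0.15), (0.15, 0.3), (0.3, 0.2), (0.9, 0.9), (0.8, 0.7)]

# Only the documents under the lens are re-factorized into finer topics.
selected = lens_select(positions, center=(0.2, 0.2), radius=0.2)
sub = [counts[i] for i in selected]
w, h = nmf(sub, k=2)

for t, row in enumerate(h):
    top = sorted(range(len(vocab)), key=lambda j: -row[j])[:2]
    print(f"lens topic {t}:", [vocab[j] for j in top])
```

Restricting the factorization to the lens selection keeps the matrix small, which is one plausible reason such a computation could run interactively. How the paper actually achieves real-time performance and view consistency is not stated in the abstract.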
