A Visual Analytics Approach for Interactive Document Clustering

EHSAN SHERKAT; EVANGELOS E. MILIOS; ROSANE MINGHIM


Abstract

Document clustering is a necessary step in various analytical and automated activities. When guided by the user, algorithms are tailored to imprint a perspective on the clustering process that reflects the user's understanding of the dataset. Beyond allowing customized adjustment of the clusters, a visual analytics approach provides tools for the user to draw new insights from the collection. While contributing his or her perspective, the user also acquires a deeper understanding of the dataset. To that effect, we propose a novel visual analytics system for interactive document clustering. We built our system on top of clustering algorithms that can adapt to the user's feedback. In the proposed system, an initial clustering is created based on the user-defined number of clusters and the selected clustering algorithm. A set of coordinated visualizations allows examination of the dataset and of the clustering results. The visualization gives the user the highlights of individual documents and an understanding of how the documents evolve over the time period to which they relate. Users then interact with the process by changing the key-terms that drive it, according to their knowledge of the document domain. In key-term-based interaction, the user assigns a set of key-terms to each target cluster to guide the clustering algorithm. We have improved that process with a novel algorithm for choosing proper seeds for the clustering. Results demonstrate that the system has considerably improved not only its precision but also its effectiveness in document-based decision making. A set of quantitative experiments and a user study were conducted to show the advantages of the approach for document analytics based on clustering. We also report on the use of the framework in a real decision-making scenario that relates users' email discussions to decision making for improving patient care. Results show that the framework is useful even for more complex datasets such as email conversations.
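
The key-term-based interaction described in the abstract can be illustrated with a minimal sketch. This is not the authors' seeding algorithm, which the abstract does not specify; it is a plain centroid-based clustering over bag-of-words vectors in which each cluster's initial centroid is built from user-supplied key-terms. The function names (`vectorize`, `cosine`, `seeded_clustering`), the cosine similarity measure, and the choice to re-add the key-terms at every centroid update are all assumptions made for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Bag-of-words term counts for a lowercased, whitespace-split text."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def seeded_clustering(docs, key_terms, iterations=5):
    """Assign each document to one cluster, seeding centroids from key-terms.

    docs: list of strings.
    key_terms: one list of user-chosen terms per target cluster.
    Returns a list with one cluster index per document.
    """
    vectors = [vectorize(d) for d in docs]
    # Assumption: the initial centroid of a cluster is just its key-terms.
    centroids = [Counter(terms) for terms in key_terms]
    labels = [0] * len(docs)
    for _ in range(iterations):
        # Assignment step: the most similar centroid wins (first one on ties).
        labels = [
            max(range(len(centroids)), key=lambda k: cosine(v, centroids[k]))
            for v in vectors
        ]
        # Update step: sum the member vectors, then re-add the key-terms so
        # the user's guidance keeps influencing the centroid (assumption).
        for k, terms in enumerate(key_terms):
            members = [v for v, label in zip(vectors, labels) if label == k]
            if members:
                merged = Counter()
                for v in members:
                    merged.update(v)
                merged.update(terms)
                centroids[k] = merged
    return labels


docs = [
    "patient care nurse hospital",
    "hospital patient treatment",
    "budget meeting finance report",
    "finance budget quarterly report",
]
labels = seeded_clustering(docs, [["patient", "hospital"], ["budget", "finance"]])
print(labels)
```

Changing the key-term lists and rerunning corresponds, in a very reduced form, to the interaction loop the abstract describes: the user edits the terms attached to a cluster and the clustering is recomputed from the new seeds.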


Bibliographic record

Source
ACM Transactions on Interactive Intelligent Systems | 2020, Issue 1 | pp. 6.1-6.33 | 33 pages
Authors
EHSAN SHERKAT; EVANGELOS E. MILIOS; ROSANE MINGHIM
Affiliations
Dalhousie University, Canada; Universidade de Sao Paulo, Brazil
Original format
PDF
Language
English
Keywords
Interactive document clustering; key-term; visualization; document projection; user study; text; email list; seeding; deterministic;

Date indexed: 2022-08-18 04:54:50

Similar literature

1. iVisClustering: An Interactive Visual Document Clustering via Topic Modeling [J]. Hanseung Lee, Jaeyeon Kihm, Jaegul Choo. Computer Graphics Forum: Journal of the European Association for Computer Graphics, 2012, Issue 3 Pt 3.
2. Comparative Exploration of Document Collections: a Visual Analytics Approach [J]. D. Oelke, H. Strobelt, C. Rohrdantz. Computer Graphics Forum: Journal of the European Association for Computer Graphics, 2014, Issue 3.
3. An interactive web-based geovisual analytics platform for co-clustering spatio-temporal data [J]. Computers & Geosciences, 2020, Apr issue.
4. Interactive Document Clustering Revisited: A Visual Analytics Approach [C]. Ehsan Sherkat, Seyednaser Nourashrafeddin, Evangelos E. Milios. International Conference on Intelligent User Interfaces, 2018.
5. Art Recording Art: Creating an Interactive Visual Document of Personal Experience [D]. Rubin, Ben. 2012.
6. IBVis: Interactive Visual Analytics for Information Bottleneck Based Trajectory Clustering [O]. Yuejun Guo, Qing Xu, Mateu Sbert. 2018.
7. iVisClustering: An Interactive Visual Document Clustering via Topic Modeling [O]. S. Bruckner, S. Miksch, H. Pfister. 2013.
