Conference: Twenty-First International Workshop on Database and Expert Systems Applications
Scalable Recursive Top-Down Hierarchical Clustering Approach with Implicit Model Selection for Textual Data Sets

Abstract

Automatic generation of taxonomies can be useful for a wide range of applications. In our application scenario, a topical hierarchy should be constructed reasonably fast from a large document collection to aid browsing of the data set. The hierarchy should also be used by the InfoSky projection algorithm to create an information landscape visualization suitable for explorative navigation of the data. We developed an algorithm that applies a scalable, recursive, top-down clustering approach to generate a dynamic concept hierarchy. The algorithm recursively applies a workflow consisting of preprocessing, clustering, cluster labeling, and projection into 2D space. Besides presenting and discussing the benefits of combining hierarchy browsing with visual exploration, we also investigate the clustering results achieved on a real-world data set.
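
To make the recursive workflow in the abstract more concrete, the following is a minimal sketch, not the authors' implementation. The abstract does not say which clustering method, labeling scheme, or stopping rule the paper uses, so everything here is an assumption for illustration: a deterministic two-way split (bisecting 2-means on term counts with cosine similarity) stands in for the clustering step, the most frequent terms serve as cluster labels, and fixed `min_size` and `max_depth` limits replace the paper's implicit model selection. The projection into 2D space (InfoSky) is left out. All function names and parameters are hypothetical.

```python
import math
from collections import Counter

# Tiny illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"the", "a", "of", "and", "to", "in", "for", "on", "is"}


def preprocess(text):
    """Lowercase, strip punctuation, drop stopwords; return term counts."""
    tokens = [t.strip(".,;:!?()").lower() for t in text.split()]
    return Counter(t for t in tokens if t and t not in STOPWORDS)


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(v * b.get(k, 0) for k, v in a.items())
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def centroid(vectors):
    """Sum of the vectors; cosine similarity ignores the missing 1/n scale."""
    total = Counter()
    for v in vectors:
        total.update(v)
    return total


def split_in_two(vectors, iterations=10):
    """Deterministic 2-means: seeds are document 0 and the one least like it."""
    seed_a = vectors[0]
    seed_b = min(vectors, key=lambda v: cosine(seed_a, v))
    centers = [seed_a, seed_b]
    assignment = []
    for _ in range(iterations):
        new_assignment = [
            0 if cosine(v, centers[0]) >= cosine(v, centers[1]) else 1
            for v in vectors
        ]
        if new_assignment == assignment:
            break  # converged
        assignment = new_assignment
        for c in (0, 1):
            members = [v for v, a in zip(vectors, assignment) if a == c]
            if members:
                centers[c] = centroid(members)
    return assignment


def label(vectors, n_terms=2):
    """Label a cluster with its most frequent terms."""
    return [term for term, _ in centroid(vectors).most_common(n_terms)]


def build_hierarchy(docs, min_size=2, max_depth=3, depth=0):
    """Recursively preprocess, cluster, and label; returns a nested dict."""
    vectors = [preprocess(d) for d in docs]
    node = {"label": label(vectors), "docs": docs, "children": []}
    if len(docs) <= min_size or depth >= max_depth:
        return node  # stop: cluster is small enough or tree is deep enough
    assignment = split_in_two(vectors)
    parts = [[d for d, a in zip(docs, assignment) if a == c] for c in (0, 1)]
    if all(parts):  # only recurse when the split produced two non-empty parts
        node["children"] = [
            build_hierarchy(p, min_size, max_depth, depth + 1) for p in parts
        ]
    return node
```

Calling `build_hierarchy` on a list of document strings returns a nested dictionary whose `children` entries mirror the topical hierarchy. A realistic version would replace the fixed binary split with a clustering step that chooses the number of children per node, and would add the 2D projection at each level.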
