
A Significance-Based Graph Model for Clustering Web Documents



Abstract

Traditional document clustering techniques rely on single-term analysis, such as the widely used Vector Space Model. However, recent approaches have emerged that are based on Graph Models and provide a more detailed description of document properties. In this work we present a novel Significance-based Graph Model for Web documents that introduces a sophisticated graph weighting method, based on significance evaluation of graph elements. We also define an associated similarity measure based on the maximum common subgraph between the graphs of the corresponding web documents. Experimental results on artificial and real document collections using well-known clustering algorithms indicate the effectiveness of the proposed approach.
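The abstract does not give the formula for the similarity measure or for the significance weights, so the following is only a rough sketch of the general idea of a maximum-common-subgraph similarity between document graphs. It assumes that each unique term is a node and that adjacent terms are linked by a directed edge, in which case the maximum common subgraph reduces to the shared nodes and shared edges. The function names `build_graph`, `graph_weight`, and `mcs_similarity` are hypothetical, and plain frequency counts stand in for the paper's significance-based weights.

```python
from collections import Counter


def build_graph(tokens):
    """Build a toy document graph from a token list.

    Nodes are unique terms and edges are ordered pairs of adjacent terms.
    Weights are raw counts: an assumed placeholder, NOT the paper's
    significance-based weighting.
    """
    nodes = Counter(tokens)
    edges = Counter(zip(tokens, tokens[1:]))
    return nodes, edges


def graph_weight(graph):
    """Total weight of all nodes and edges in a graph."""
    nodes, edges = graph
    return sum(nodes.values()) + sum(edges.values())


def mcs_similarity(g1, g2):
    """Weight of the common subgraph divided by the larger graph's weight.

    Because node labels are unique, the maximum common subgraph is simply
    the set of shared nodes and shared edges. Each shared element
    contributes the smaller of its two weights (an assumed choice).
    """
    (nodes1, edges1), (nodes2, edges2) = g1, g2
    common = sum(min(nodes1[t], nodes2[t]) for t in nodes1.keys() & nodes2.keys())
    common += sum(min(edges1[e], edges2[e]) for e in edges1.keys() & edges2.keys())
    denom = max(graph_weight(g1), graph_weight(g2))
    return common / denom if denom else 0.0


doc_a = build_graph("web document clustering".split())
doc_b = build_graph("web document retrieval".split())
print(mcs_similarity(doc_a, doc_b))
```

A pairwise similarity of this kind can be converted to a distance (for example, one minus the similarity) and passed to a standard clustering algorithm; which algorithms the authors used is not stated in the abstract.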
