Conference: International Workshop of the Initiative for the Evaluation of XML Retrieval
Clustering XML Documents Using Closed Frequent Subtrees: A Structural Similarity Approach

Abstract

This paper presents an experimental study conducted on the INEX 2007 Document Mining Challenge corpus using a frequent subtree-based incremental clustering approach. Closed frequent subtrees are first generated from the structural information of the XML documents. A matrix is then built that represents the distribution of the closed frequent subtrees across the documents, and this matrix is used to cluster the XML documents progressively. Despite the large number of documents in the INEX 2007 Wikipedia dataset, the proposed approach clustered the documents successfully.
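
The pipeline described in the abstract (subtree-by-document matrix, then progressive clustering) might be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the closed frequent subtrees have already been mined by a separate step and are given as hypothetical identifiers (`t1`, `t2`, ...), and it assumes a cosine similarity with a fixed threshold, neither of which the abstract specifies. The function names `build_matrix` and `incremental_cluster` are likewise made up for this example.

```python
from math import sqrt


def build_matrix(doc_subtrees, subtree_ids):
    """Rows are closed frequent subtrees, columns are documents.

    A cell is 1 if the subtree occurs in the document, else 0.
    """
    return [[1 if s in subtrees else 0 for subtrees in doc_subtrees]
            for s in subtree_ids]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = sqrt(sum(x * x for x in a))
    nb = sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def incremental_cluster(matrix, threshold=0.5):
    """Assign documents one at a time to the most similar existing cluster.

    Assumption: a document joins the best cluster if its cosine similarity
    to that cluster's summed vector reaches `threshold`; otherwise it
    starts a new cluster. Returns a list of lists of document indices.
    """
    n_docs = len(matrix[0]) if matrix else 0
    clusters = []  # each cluster: {"members": [...], "sum": [...]}
    for j in range(n_docs):
        vec = [row[j] for row in matrix]  # column j = document j
        best, best_sim = None, threshold
        for c in clusters:
            sim = cosine(vec, c["sum"])
            if sim >= best_sim:
                best, best_sim = c, sim
        if best is None:
            clusters.append({"members": [j], "sum": vec[:]})
        else:
            best["members"].append(j)
            best["sum"] = [a + b for a, b in zip(best["sum"], vec)]
    return [c["members"] for c in clusters]


# Toy input: which (hypothetical) closed frequent subtrees occur in each document.
docs = [{"t1", "t2"}, {"t1", "t2", "t3"}, {"t4"}, {"t4", "t5"}]
ids = ["t1", "t2", "t3", "t4", "t5"]
matrix = build_matrix(docs, ids)
print(incremental_cluster(matrix))  # → [[0, 1], [2, 3]]
```

Because documents are processed one at a time against running cluster summaries, no pairwise document-to-document similarity matrix is needed, which is one plausible reason an incremental scheme suits a corpus as large as the INEX 2007 Wikipedia dataset.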