Conference: International Workshop on Database Technology and Applications

Clustering Algorithm Based on Semantic Distance for XML Documents

Abstract

As information grows exponentially, narrowing the search scope efficiently and accurately has become a basic requirement for information querying. This paper proposes a clustering algorithm for XML documents based on semantic distance. The algorithm proceeds in two steps. First, it groups all heterogeneous DTD documents into DTD clusters using a global semantic dictionary. Second, it computes the semantic distance between the XML documents that correspond to a given DTD cluster and builds the final XML clusters according to a predefined threshold. Users can then locate the relevant document cluster and query within it rather than searching across all XML documents, so results that satisfy their requirements are returned quickly. Experiments show that the algorithm categorizes documents well and facilitates information querying.
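
The abstract does not define the semantic distance or the clustering procedure, so the following is only a rough sketch of threshold-based clustering under stated assumptions. It collapses the two steps into a single pass over XML documents. The `SYNONYMS` table is a hypothetical stand-in for the global semantic dictionary, Jaccard distance over element names is an assumed substitute for the paper's semantic distance, and the single-pass "leader" scheme in `cluster` is one simple way to apply a predefined threshold, not necessarily the authors' method.

```python
import xml.etree.ElementTree as ET

# Hypothetical stand-in for a global semantic dictionary:
# maps synonymous element names to one canonical term.
SYNONYMS = {"writer": "author", "name": "title"}


def tag_set(xml_text):
    """Return the set of canonical element names used in one XML document."""
    root = ET.fromstring(xml_text)
    return {SYNONYMS.get(el.tag, el.tag) for el in root.iter()}


def distance(a, b):
    """Jaccard distance between two tag sets (an assumed measure)."""
    union = a | b
    if not union:
        return 0.0
    return 1.0 - len(a & b) / len(union)


def cluster(docs, threshold):
    """Single-pass clustering: a document joins the first cluster whose
    representative is within `threshold`; otherwise it starts a new cluster."""
    clusters = []  # list of (representative tag set, [document indices])
    for i, doc in enumerate(docs):
        tags = tag_set(doc)
        for rep, members in clusters:
            if distance(rep, tags) <= threshold:
                members.append(i)
                break
        else:
            clusters.append((tags, [i]))
    return [members for _, members in clusters]


docs = [
    "<book><title>A</title><author>X</author></book>",
    "<book><name>B</name><writer>Y</writer></book>",
    "<order><item>pen</item><price>2</price></order>",
]
print(cluster(docs, threshold=0.5))  # [[0, 1], [2]]
```

The first two documents use different element names but map to the same canonical terms, so they fall into one cluster. The third shares no element names with them and forms its own. A query could then be restricted to the documents of a single cluster instead of scanning the whole collection.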