Path Bitmap Indexing for Retrieval of XML Documents

Abstract

Path-based indexing methods, such as three-dimensional bitmap indexing, have been used to collect and retrieve similar XML documents. In these methods, the path is the fundamental unit from which the index is built. When a document's structure changes, the path extracted before the change and the path extracted after it are treated as entirely different, regardless of how small the change is. As a result, path-based indexing methods usually perform poorly when retrieving and clustering similar documents, and a method that can detect similar paths is needed for effective collection and retrieval of XML documents. This paper defines a new path construction similarity, which measures the similarity between paths, and proposes a path bitmap indexing method that effectively loads and extracts similar paths. The proposed method extracts a representative path from each group of similar paths. The paths are clustered using these representative paths, and the XML documents are in turn clustered using the clustered paths. This solves the problem with existing three-dimensional bitmap indexing. In the performance evaluation, the proposed method showed better clustering accuracy than existing methods.
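
The idea of grouping similar paths under a representative path and then indexing documents against those representatives can be illustrated with a small sketch. The abstract does not give the paper's definition of path construction similarity, so everything below is an assumption made for illustration: the similarity is a stand-in (longest common subsequence of element names divided by the longer path length), the representative of a cluster is simply its first member, the threshold of 0.6 is arbitrary, and the names `path_similarity`, `cluster_paths`, and `build_bitmap` are hypothetical.

```python
from typing import Dict, List, Tuple

Path = Tuple[str, ...]  # an XML path as a tuple of element names


def lcs_len(a: Path, b: Path) -> int:
    """Length of the longest common subsequence of two paths."""
    prev = [0] * (len(b) + 1)
    for x in a:
        cur = [0]
        for j, y in enumerate(b, 1):
            cur.append(prev[j - 1] + 1 if x == y else max(prev[j], cur[j - 1]))
        prev = cur
    return prev[-1]


def path_similarity(a: Path, b: Path) -> float:
    """Stand-in similarity (an assumption, not the paper's definition)."""
    if not a and not b:
        return 1.0
    return lcs_len(a, b) / max(len(a), len(b))


def cluster_paths(paths: List[Path], threshold: float = 0.6):
    """Greedy clustering: a path joins the first representative it is
    similar enough to; otherwise it becomes a new representative."""
    reps: List[Path] = []
    assignment: Dict[Path, int] = {}
    for p in paths:
        for i, r in enumerate(reps):
            if path_similarity(p, r) >= threshold:
                assignment[p] = i
                break
        else:
            reps.append(p)
            assignment[p] = len(reps) - 1
    return reps, assignment


def build_bitmap(docs: Dict[str, List[Path]], threshold: float = 0.6):
    """One bit per (document, representative path): 1 if the document
    contains any path from that representative's cluster."""
    all_paths = list(dict.fromkeys(p for ps in docs.values() for p in ps))
    reps, assignment = cluster_paths(all_paths, threshold)
    bitmap = {name: [0] * len(reps) for name in docs}
    for name, ps in docs.items():
        for p in ps:
            bitmap[name][assignment[p]] = 1
    return reps, bitmap


# Hypothetical documents: d2 is d1 with an extra "info" level inserted.
docs = {
    "d1": [("book", "title"), ("book", "author", "name")],
    "d2": [("book", "info", "title"), ("book", "author", "name")],
    "d3": [("order", "item", "price")],
}
reps, bitmap = build_bitmap(docs)
for name, row in bitmap.items():
    print(name, row)
```

With exact path matching, `d1` and `d2` would share only one of their paths. Under the assumed similarity, `("book", "title")` and `("book", "info", "title")` fall into the same cluster, so the two documents get identical bitmap rows, which is the kind of tolerance to structural change the abstract calls for.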
