Conference: The 2nd International Conference on Information Science and Engineering

XPC: A novel method for retrieving massive small-scale XML documents via path constraints

Abstract

This paper proposes XPC, a novel method for searching massive collections of small-scale XML documents via path constraints, to overcome the drawbacks of conventional approaches. First, we propose retrieving XML data with keywords qualified by simple path constraints, which gives users a friendly way to query without requiring complex background knowledge while still expressing their needs accurately. Second, we propose a novel method, called rtf-idf, for computing term weights in documents under path constraints. It measures the similarity of path constraints using N-grams and other factors derived from the structure of the XML documents. We then rank the relevant documents with an extension of the vector space model. The experimental results show that XPC outperforms baseline methods such as VSM on plain text and JuruXML.
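
The idea of weighting keywords by how well their location matches a path constraint can be illustrated with a minimal sketch. The abstract does not give the rtf-idf formula, so everything below is an assumption made for illustration: the Dice coefficient over bigrams of path steps stands in for the N-gram path similarity, the function names (`path_ngrams`, `path_similarity`, `score`) are hypothetical, the idf form is a common textbook variant, and the final score is a plain sum of weights rather than the paper's extended vector space model.

```python
import math
from collections import Counter


def path_ngrams(path, n=2):
    """Split a path such as '/article/title' into n-grams of element names.

    Start ('^') and end ('$') markers are added so that position in the
    path influences the match. This is an assumed design, not the paper's.
    """
    steps = ["^"] + [s for s in path.split("/") if s] + ["$"]
    return Counter(tuple(steps[i:i + n]) for i in range(len(steps) - n + 1))


def path_similarity(query_path, doc_path, n=2):
    """Dice coefficient over path n-grams (assumed similarity measure)."""
    q = path_ngrams(query_path, n)
    d = path_ngrams(doc_path, n)
    overlap = sum((q & d).values())
    total = sum(q.values()) + sum(d.values())
    return 2.0 * overlap / total if total else 0.0


def score(query, docs):
    """Score each document against a query.

    query: list of (path_constraint, keyword) pairs.
    docs:  list of documents, each a list of (path, term) occurrences.
    Returns one score per document, in the same order as docs.
    """
    n_docs = len(docs)
    scores = []
    for doc in docs:
        total = 0.0
        for q_path, keyword in query:
            # Term frequency where each occurrence counts in proportion
            # to how similar its path is to the query's path constraint.
            weighted_tf = sum(
                path_similarity(q_path, p) for p, t in doc if t == keyword
            )
            # Document frequency of the keyword, ignoring paths.
            df = sum(1 for d in docs if any(t == keyword for _, t in d))
            idf = math.log(1 + n_docs / df) if df else 0.0
            total += weighted_tf * idf
        scores.append(total)
    return scores


# Hypothetical toy collection: each document is a list of (path, term) pairs.
docs = [
    [("/article/title", "xml"), ("/article/body", "search")],
    [("/article/body", "xml")],
    [("/article/body", "index")],
]
query = [("/article/title", "xml")]
scores = score(query, docs)
```

In this toy collection the first document contains the keyword at exactly the requested path, the second contains it at a partially matching path, and the third does not contain it at all, so the scores fall in that order.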
