
Relaxed Global Term Weights for XML Element Search


Abstract

XML element search engines return XML elements, which are parts of XML documents, as search results. Existing approaches to XML element search are adapted from information retrieval techniques for document search. Global weights for each term can be calculated from statistics of the XML elements that share 1) the same path expression or 2) the same tag. In the first approach, the more complex a path expression is, the fewer XML elements match it. As a result, global term weights may be calculated from the statistics of only a few XML elements, and such weights are not truly global. The second approach has the problem that it does not consider the document structure of XML elements. To resolve these problems, we propose a method for calculating accurate global weights. In our method, we regard a path expression as an array of tags. We relax the restrictions on the order and frequency with which tags appear in a path expression, so that similar path expressions are gathered into the same class. In this way, we aim to reduce the number of classes that contain hardly any elements. Our experimental results show that, on a certain test collection, our method can integrate path expressions without decreasing search accuracy.
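
The relaxation described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact class construction or the weighting formula, so everything below is an assumption made for illustration: the hypothetical `relaxed_class` helper drops repeated tags and sorts the remainder, and the hypothetical `class_idf` helper uses a simple IDF-style weight computed per class.

```python
import math
from collections import Counter, defaultdict


def relaxed_class(path, ignore_order=True, ignore_frequency=True):
    """Map a path expression such as '/article/sec/sec/p' to a class key.

    Hypothetical helper: the path is treated as an array of tags, and the
    two flags relax tag frequency and tag order respectively.
    """
    tags = [t for t in path.split("/") if t]
    if ignore_frequency:
        # Collapse repeated tags, keeping the first occurrence of each.
        tags = list(dict.fromkeys(tags))
    if ignore_order:
        # Discard the order in which tags appear.
        tags = sorted(tags)
    return tuple(tags)


def class_idf(elements):
    """Compute an IDF-style global weight per (class, term).

    `elements` is an iterable of (path, terms) pairs. The formula
    log(1 + N / df) is an assumption, not the weighting used in the paper.
    """
    n_elems = Counter()        # number of elements in each class
    df = defaultdict(Counter)  # per-class element frequency of each term
    for path, terms in elements:
        key = relaxed_class(path)
        n_elems[key] += 1
        df[key].update(set(terms))
    return {
        key: {t: math.log(1 + n_elems[key] / d) for t, d in counts.items()}
        for key, counts in df.items()
    }


elements = [
    ("/article/sec/p", {"xml", "search"}),
    ("/article/sec/sec/p", {"xml"}),
    ("/article/title", {"xml"}),
]
weights = class_idf(elements)
```

In this toy input, `/article/sec/p` and `/article/sec/sec/p` fall into the same class, so their term statistics are pooled instead of being split across two sparsely populated classes.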
