Conference: WISE 2014

Keyword Search over Web Documents Based on Earth Mover's Distance

Abstract

Keyword search is widely used in many practical applications. Unfortunately, most keyword-based search engines compute the similarity distance between two Web documents by matching only the keywords at the same positions in the query and document vectors, without considering the impact of keywords at neighbouring positions. Such an approach usually yields incomplete search results. In this paper, we exploit the Earth Mover's Distance (EMD) as a distance function, which is more flexible than other distance functions such as the Euclidean distance. To overcome the high computational complexity of EMD, we use filtering techniques to minimize the number of actual EMD computations. We further develop a novel lower bound as a new EMD filter for a partial matching technique that is suitable for searching Web documents. The experimental results demonstrate the efficiency of EMD-based search with filtering techniques.
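The contrast between position-wise matching and EMD, and the filter-then-refine idea, can be illustrated with a minimal Python sketch. It is not the paper's method. It assumes documents are equal-mass 1-D histograms over ordered positions with ground distance |i - j|, in which case EMD reduces to a running sum of carried mass. The filter is the classic centroid lower bound, used here as a stand-in for the paper's novel bound, which the abstract does not specify. All function names are hypothetical.

```python
import math


def normalize(weights):
    """Scale a weight vector so that its total mass is 1."""
    total = sum(weights)
    return [w / total for w in weights]


def euclidean(p, q):
    """Position-wise distance: only entries at the same index interact."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def emd_1d(p, q):
    """EMD for equal-mass 1-D histograms with ground distance |i - j|.

    Assumption of this sketch: in one dimension the optimal flow cost equals
    the sum of absolute differences between the cumulative distributions.
    """
    carried, cost = 0.0, 0.0
    for a, b in zip(p, q):
        carried += a - b      # mass that must move past this position
        cost += abs(carried)
    return cost


def centroid_lower_bound(p, q):
    """Cheap lower bound on emd_1d: distance between the two centroids."""
    cp = sum(i * a for i, a in enumerate(p))
    cq = sum(i * b for i, b in enumerate(q))
    return abs(cp - cq)


def range_search(query, docs, radius):
    """Return (hits, emd_calls); the exact EMD runs only on unfiltered docs."""
    hits, emd_calls = [], 0
    for doc_id, doc in docs.items():
        if centroid_lower_bound(query, doc) > radius:
            continue          # pruned: the true EMD cannot be within radius
        emd_calls += 1
        distance = emd_1d(query, doc)
        if distance <= radius:
            hits.append((doc_id, distance))
    return sorted(hits, key=lambda hit: hit[1]), emd_calls


if __name__ == "__main__":
    query = normalize([1, 0, 0, 0, 0])
    docs = {
        "same": normalize([1, 0, 0, 0, 0]),
        "near": normalize([0, 1, 0, 0, 0]),  # mass at a neighbouring position
        "far": normalize([0, 0, 0, 0, 1]),   # mass four positions away
    }
    for name, doc in docs.items():
        print(name, euclidean(query, doc), emd_1d(query, doc))
    print(range_search(query, docs, radius=1.5))
```

In this toy data, "near" and "far" are equally distant from the query under the Euclidean distance, because neither shares a position with it, whereas EMD ranks "near" (cost 1) well ahead of "far" (cost 4). With a radius of 1.5, the centroid bound prunes "far" before any exact EMD computation, so only two of the three documents reach the refinement step.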