13th International Conference on Extending Database Technology 2010

Feedback-driven Result Ranking and Query Refinement for Exploring Semi-structured Data Collections

Abstract

Feedback has been used extensively in document-centric applications, such as text retrieval and multimedia retrieval. Recently, there have been efforts to apply feedback to semi-structured XML document collections as well. In this paper, we note that feedback can also be an effective tool for exploring large semi-structured data collections through result ranking and query refinement. In particular, in large-scale data sharing and curation environments, where the user may not know the structure of the data, queries may initially be overly vague. Given a path query and the set of results the system identifies for this query over the data, we consider two types of feedback. Soft feedback captures the user's preference for some features over others. Hard feedback, on the other hand, expresses the user's assertions about whether certain features should be further enforced or, in contrast, avoided. Both soft and hard feedback can be "positive" or "negative". For soft feedback, we develop a probabilistic feature significance measure and describe how to use it to rank results in the presence of dependencies between path features. To handle hard feedback efficiently (i.e., fast enough for interactive exploration), we present finite-automata-based query refinement solutions. In particular, we present a novel LazyDFA+ algorithm for managing hard feedback. We also describe optimizations that leverage the inherently iterative nature of the feedback process. We bring these techniques together in AXP, a system for adaptive and exploratory path retrieval. The experimental results show the effectiveness of the proposed techniques.
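
To make the distinction between the two feedback types concrete, the following is a minimal toy sketch and not the paper's method. It assumes each label in a `/`-separated path counts as one feature, treats hard feedback as a plain filter, and scores soft feedback with a smoothed frequency difference. All function names and sample paths are hypothetical. It does not model dependencies between path features and does not use automata or the LazyDFA+ algorithm.

```python
from collections import Counter


def path_features(path):
    """Split a '/'-separated path into its set of labels (assumed feature model)."""
    return set(path.strip("/").split("/"))


def apply_hard_feedback(paths, required=(), avoided=()):
    """Hard feedback as a filter: keep paths with every required label and no avoided label."""
    required, avoided = set(required), set(avoided)
    return [
        p for p in paths
        if required <= path_features(p) and not (avoided & path_features(p))
    ]


def soft_scores(liked, disliked):
    """Toy significance: smoothed share of liked results containing a label
    minus the smoothed share of disliked results containing it."""
    pos = Counter(f for p in liked for f in path_features(p))
    neg = Counter(f for p in disliked for f in path_features(p))
    return {
        f: (pos[f] + 1) / (len(liked) + 2) - (neg[f] + 1) / (len(disliked) + 2)
        for f in set(pos) | set(neg)
    }


def rank(paths, scores):
    """Order paths by the summed score of their labels, highest first."""
    return sorted(
        paths,
        key=lambda p: sum(scores.get(f, 0.0) for f in path_features(p)),
        reverse=True,
    )


# Hypothetical result paths for a vague initial query.
results = [
    "/lib/book/title",
    "/lib/book/author/name",
    "/lib/journal/title",
    "/lib/journal/editor/name",
]

# Negative hard feedback: the user asserts that 'editor' must not appear.
kept = apply_hard_feedback(results, avoided={"editor"})

# Soft feedback: the user prefers one result and dislikes another.
scores = soft_scores(liked=["/lib/book/title"], disliked=["/lib/journal/title"])
ranked = rank(kept, scores)
print(ranked)
```

In this sketch hard feedback changes which results are returned at all, while soft feedback only reorders them, which mirrors the distinction the abstract draws between query refinement and result ranking.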