Source: Journal of Information Science and Engineering

Adaptive Query Relaxation and Result Categorization Based on Data Distribution and Query Context



Abstract

To address the empty-answer and many-answers problems of Web database queries, this paper proposes a general framework for automatic query relaxation and result categorization. The framework consists of two processing parts. The first is query relaxation. In this part, each specified attribute is assigned a weight by measuring how the query value is distributed in the database. A query value that occurs rarely in the database indicates that the attribute may be important to the user. The original query is then rewritten as a relaxed query by expanding each specified attribute according to its weight. The second part is result categorization. In this part, the framework first uses the KL-divergence to estimate how much the user cares about each attribute (both specified and unspecified) under the query context. The categorizing attribute for each level of the navigational tree is then chosen according to its importance to the user, with the most important attribute used for the first level. Lastly, the navigational tree is generated automatically and presented to the user, so that the user can easily select the relevant tuples that match his or her needs. Experimental results show that the query relaxation method achieves a Precision of 78% on UsedCarDB (a used car dataset) and 75% on HouseDB (a real estate dataset), and that the result categorization method achieves lower total and average navigational costs than existing categorization methods.
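
The abstract does not state the weighting or divergence formulas, so the following Python sketch is only one plausible reading of the two steps. It assumes an IDF-style weight, log((N + 1) / (freq + 1)) normalized over the specified attributes, for query relaxation, and it scores each candidate categorizing attribute by the KL-divergence between its value distribution in the query result and in the whole database. The function names, the smoothing, and the toy CARS table are hypothetical and are not taken from the paper.

```python
import math
from collections import Counter

# Hypothetical toy table standing in for a dataset such as UsedCarDB.
CARS = [
    {"make": "Honda", "color": "red", "body": "sedan"},
    {"make": "Honda", "color": "blue", "body": "sedan"},
    {"make": "Honda", "color": "red", "body": "sedan"},
    {"make": "Toyota", "color": "blue", "body": "suv"},
    {"make": "Toyota", "color": "red", "body": "suv"},
    {"make": "Ferrari", "color": "red", "body": "coupe"},
]


def rarity_weights(rows, query):
    """Weight each specified attribute by how rare its query value is.

    Assumption: an IDF-style score log((N + 1) / (freq + 1)), normalized so
    the weights sum to 1. The paper's actual formula may differ.
    """
    n = len(rows)
    raw = {}
    for attr, value in query.items():
        freq = sum(1 for row in rows if row[attr] == value)
        raw[attr] = math.log((n + 1) / (freq + 1))
    total = sum(raw.values())
    if total == 0:
        # Every query value matches every tuple: fall back to equal weights.
        return {attr: 1 / len(raw) for attr in raw}
    return {attr: score / total for attr, score in raw.items()}


def distribution(rows, attr):
    """Empirical distribution of an attribute's values over the given rows."""
    counts = Counter(row[attr] for row in rows)
    total = sum(counts.values())
    return {value: count / total for value, count in counts.items()}


def kl_divergence(p, q):
    """D(P || Q), summed over the support of P.

    Here P comes from a subset of the rows behind Q, so q[v] > 0
    whenever p[v] > 0 and the logarithm is always defined.
    """
    return sum(pv * math.log(pv / q[v]) for v, pv in p.items() if pv > 0)


def rank_attributes(rows, result_rows, attrs):
    """Order attributes by how far the result set departs from the database."""
    scores = {
        attr: kl_divergence(distribution(result_rows, attr),
                            distribution(rows, attr))
        for attr in attrs
    }
    order = sorted(attrs, key=lambda attr: scores[attr], reverse=True)
    return order, scores


weights = rarity_weights(CARS, {"make": "Ferrari", "color": "red"})
hondas = [row for row in CARS if row["make"] == "Honda"]
order, scores = rank_attributes(CARS, hondas, ["color", "body"])
print(weights)
print(order, scores)
```

On this toy table, the rare value make = Ferrari receives a larger weight than the common value color = red, and body ranks above color as a categorizing attribute for the Honda results, because the result set is all sedans while its color mix matches the database as a whole.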
