Journal: IEICE Transactions on Information and Systems

SRT-Rank: Ranking Keyword Query Results in Relational Databases Using the Strongly Related Tree


Abstract

A top-k keyword query in relational databases returns k trees of tuples, in which the tuples containing the query keywords are connected via primary key-foreign key relationships, in the order of relevance to the query. Existing works are classified into two categories: 1) the schema-based approach and 2) the schema-free approach. We focus on the former utilizing database schema information for more effective ranking of the query results. Ranking measures used in existing works can be classified into two categories: 1) the size of the tree (i.e., the syntactic score) and 2) ranking measures, such as TF-IDF, borrowed from the information retrieval field. However, these measures do not take into account semantic relevancy among relations containing the tuples in the query results. In this paper, we propose a new ranking method that ranks the query results by utilizing semantic relevancy among relations containing the tuples at the schema level. First, we propose a structure of semantically strongly related relations, which we call the strongly related tree (SRT). An SRT is a tree that maximally connects relations based on the lossless join property. Next, we propose a new ranking method, SRT-Rank, that ranks the query results by a new scoring function augmenting existing ones with the concept of the SRT. SRT-Rank is the first research effort that applies semantic relevancy among relations to ranking the results of keyword queries. To show the effectiveness of SRT-Rank, we perform experiments on synthetic and real datasets by augmenting the representative existing methods with SRT-Rank. Experimental results show that, compared with existing methods, SRT-Rank improves performance by up to 46.9%, 160.0%, 61.7%, and 63.8%, respectively, in terms of four quality measures: the mean normalized discounted cumulative gain (nDCG), the number of queries whose top-1 result is relevant to the query, the mean reciprocal rank, and the mean average precision. In addition, we show that the query performance of SRT-Rank is comparable to or better than those of existing methods.
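The four quality measures named in the abstract are standard information-retrieval metrics. The following is a minimal Python sketch of how they could be computed for one query's ranked result list. The abstract does not give the paper's exact definitions, so the choices here are assumptions: DCG uses linear gain with a log2 position discount, and average precision is normalized by the number of relevant results that appear in the list. All function names are hypothetical.

```python
import math
from typing import Optional, Sequence


def dcg(gains: Sequence[float]) -> float:
    """Discounted cumulative gain: gain at rank i (1-based) divided by log2(i + 1)."""
    return sum(g / math.log2(rank + 1) for rank, g in enumerate(gains, start=1))


def ndcg(gains: Sequence[float], k: Optional[int] = None) -> float:
    """DCG of the top-k results divided by the DCG of the ideal (sorted) ordering."""
    k = len(gains) if k is None else k
    ideal = dcg(sorted(gains, reverse=True)[:k])
    return dcg(gains[:k]) / ideal if ideal > 0 else 0.0


def reciprocal_rank(relevant: Sequence[bool]) -> float:
    """1 / rank of the first relevant result, or 0 if none is relevant."""
    for rank, is_rel in enumerate(relevant, start=1):
        if is_rel:
            return 1.0 / rank
    return 0.0


def average_precision(relevant: Sequence[bool]) -> float:
    """Mean of precision@rank over the ranks holding a relevant result.

    Assumption: normalized by the relevant results present in the list,
    not by the total number of relevant results in the database.
    """
    hits, total = 0, 0.0
    for rank, is_rel in enumerate(relevant, start=1):
        if is_rel:
            hits += 1
            total += hits / rank
    return total / hits if hits else 0.0


def top1_is_relevant(relevant: Sequence[bool]) -> bool:
    """True when the first-ranked result is relevant to the query."""
    return bool(relevant) and bool(relevant[0])
```

The mean variants reported in the abstract (mean nDCG, mean reciprocal rank, mean average precision) would then be the averages of these per-query values over the query set, and the top-1 measure would be the count of queries for which `top1_is_relevant` holds.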
