Efficient top-K query evaluation on probabilistic data

Abstract

A novel approach computes and efficiently ranks the top-k answers to a query on a probabilistic database. The approach focuses on the top-k answers because imprecisions in the data often lead to a large number of low-quality answers. The algorithm runs several Monte Carlo simulations in parallel, one for each candidate answer, and approximates the probability of each only to the extent needed to correctly determine the top-k answers. The algorithm is provably optimal and scales to large databases. A more general application can identify a number of top-rated entities in a group that satisfy a condition, based on a criterion or score computed for the entities. Several optimization techniques are also disclosed. One option is to rank the top-rated results; another allows the iteration to be interrupted so that the top-rated entities identified thus far are returned.
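The idea of running one Monte Carlo simulation per candidate answer, and refining each estimate only as far as the top-k decision requires, can be illustrated with a minimal sketch. It is not the patented procedure: the function names (`hoeffding_interval`, `top_k_multisim`), the Hoeffding-bound confidence interval, the batch size, and the rule "simulate every candidate whose interval still overlaps the undecided region" are all assumptions made for illustration. Each candidate is represented by a hypothetical zero-argument sampler that returns True or False for one simulation step.

```python
import math
import random


def hoeffding_interval(successes, trials, delta):
    """Confidence interval for a Bernoulli mean (Hoeffding bound, assumed here)."""
    mean = successes / trials
    half = math.sqrt(math.log(2.0 / delta) / (2.0 * trials))
    return max(0.0, mean - half), min(1.0, mean + half)


def top_k_multisim(samplers, k, delta=0.01, batch=20, max_trials=200_000):
    """Return (top_k_names, certified, trials_per_candidate).

    samplers: dict mapping a candidate name to a zero-argument function
    that performs one simulation step and returns True or False.
    certified is False when the trial budget ran out before the top-k
    set was separated from the rest (the "interrupt" option).
    """
    if not 0 < k < len(samplers):
        raise ValueError("k must be between 1 and len(samplers) - 1")
    wins = {name: 0 for name in samplers}
    runs = {name: 0 for name in samplers}

    def simulate(name):
        # Advance one candidate's simulation by a batch of steps.
        for _ in range(batch):
            wins[name] += bool(samplers[name]())
        runs[name] += batch

    for name in samplers:
        simulate(name)

    while True:
        bounds = {n: hoeffding_interval(wins[n], runs[n], delta) for n in samplers}
        ranked = sorted(samplers, key=lambda n: bounds[n][0], reverse=True)
        top, rest = ranked[:k], ranked[k:]
        c = min(bounds[n][0] for n in top)   # lowest lower bound in the top-k
        d = max(bounds[n][1] for n in rest)  # highest upper bound outside it
        done = c >= d                        # top-k intervals clear all others
        if done or sum(runs.values()) >= max_trials:
            # Order by current estimate; this ordering is not itself certified.
            top.sort(key=lambda n: wins[n] / runs[n], reverse=True)
            return top, done, dict(runs)
        for n in samplers:
            lo, hi = bounds[n]
            if lo < d and hi > c:            # interval overlaps the undecided region
                simulate(n)


if __name__ == "__main__":
    rng = random.Random(0)
    true_p = {"a": 0.9, "b": 0.6, "c": 0.3, "d": 0.1}  # made-up probabilities
    samplers = {n: (lambda p=p: rng.random() < p) for n, p in true_p.items()}
    print(top_k_multisim(samplers, k=2))
```

Candidates whose intervals already lie entirely above or below the undecided region receive no further simulation steps, which is where the savings come from. The `max_trials` budget stands in for the interrupt option described in the abstract: when it is exhausted, the sketch returns its current best guess and flags it as uncertified.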

Bibliographic details

Publication number: US7814113B2

Patent type:
Publication date: 2010-10-12

Original format: PDF
Applicant/patentee: DAN SUCIU; CHRISTOPHER RE

Application number: US20070935230
Inventors: DAN SUCIU; CHRISTOPHER RE

Filing date: 2007-11-05
Classification: G06F7/00
Country: US
Record added: 2022-08-21 18:52:34
