Information Sciences: An International Journal

Sliding-Window Probabilistic Threshold Aggregate Queries on Uncertain Data Streams


Abstract

Uncertain data streams are ubiquitous in sensing and networking environments. Probabilistic aggregate queries, which return a probability distribution over the possible answers, are widely used on such streams. In many monitoring applications, however, it is only necessary to know whether the result distribution exceeds user-defined thresholds. In this paper, we formalize two important query types, the sliding-window probabilistic threshold sum query and the sliding-window probabilistic threshold count query, which apply two thresholds (probability and score) to the probability distribution. An intuitive solution is to use existing probabilistic aggregate algorithms to obtain the probability distribution and then apply the thresholds to it. However, this solution separates threshold processing from query processing, which results in low efficiency. Our work uses Gaussian mixture models to represent uncertain data. Based on the Gaussian properties and the probability theory of this model, we design efficient algorithms to answer these queries, which combine filtering strategies with exact calculations. Several techniques (e.g., characteristic functions, incremental calculation, pruning strategies, and state transition equations) are integrated into the exact calculations to improve time and space efficiency. Experiments on real and synthetic datasets demonstrate that our algorithms outperform existing algorithms. (c) 2020 Elsevier Inc. All rights reserved.
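To make the threshold sum query concrete, here is a minimal Python sketch under simplifying assumptions that are not taken from the paper. Each stream item is modeled as one independent Gaussian rather than a full Gaussian mixture, and the query is read as "is P(window sum > score) at least the probability threshold?". Under those assumptions the window sum is itself Gaussian, so its mean and variance can be maintained incrementally as the window slides. The class and method names are hypothetical and chosen for illustration only.

```python
import math
from collections import deque


class SlidingWindowThresholdSum:
    """Illustrative threshold sum query over independent Gaussian items.

    Assumption: each item is a single Gaussian N(mean, variance), so the
    window sum is N(sum of means, sum of variances). This is a simplified
    stand-in for the Gaussian mixture model described in the abstract.
    """

    def __init__(self, window_size):
        self.window_size = window_size
        self.items = deque()   # (mean, variance) pairs currently in the window
        self.mean_sum = 0.0    # running sum of the means in the window
        self.var_sum = 0.0     # running sum of the variances in the window

    def push(self, mean, variance):
        # Incremental update: add the new item, then evict the oldest one
        # if the window has grown past its capacity.
        self.items.append((mean, variance))
        self.mean_sum += mean
        self.var_sum += variance
        if len(self.items) > self.window_size:
            old_mean, old_var = self.items.popleft()
            self.mean_sum -= old_mean
            self.var_sum -= old_var

    def tail_probability(self, score):
        # P(S > score) for S ~ N(mean_sum, var_sum), via the complementary
        # error function. A non-positive variance is treated as a point mass.
        if self.var_sum <= 0.0:
            return 1.0 if self.mean_sum > score else 0.0
        z = (score - self.mean_sum) / math.sqrt(2.0 * self.var_sum)
        return 0.5 * math.erfc(z)

    def exceeds(self, score, prob):
        # Assumed query semantics: true when P(S > score) >= prob.
        return self.tail_probability(score) >= prob


if __name__ == "__main__":
    query = SlidingWindowThresholdSum(window_size=2)
    for mean, variance in [(1.0, 1.0), (2.0, 1.0), (3.0, 2.0)]:
        query.push(mean, variance)
    # The window now holds the last two items, so the sum has mean 5 and variance 3.
    print(query.tail_probability(5.0), query.exceeds(10.0, 0.5))
```

Each update costs constant time because only two running totals change. Extending the sketch to mixtures would require tracking a distribution over component combinations, which is where techniques such as the characteristic function mentioned in the abstract would come in. That extension is not attempted here.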
