Conference: ICIC 2013

Using a Real-Time Top-k Algorithm to Mine the Most Frequent Items over Multiple Streams

Abstract

Applications such as sensor networks, Internet traffic analysis, location-based services, and health measurements must handle stream data that is unbounded, fast, high-volume, continuous, and in some cases distributed. A synopsis, kept as a list of partial summaries of the unknown item sets, reduces memory usage enough to keep up with such fast and massive incoming data. Normally, item sets of different sizes lead to different summaries, especially for the Top-k operator, which acts as a partial preprocessing step over the synopsis. We therefore propose a smooth synopsis that dynamically assigns a numeric interval to resolve the item set, so that partial Top-k processing maintains a more accurate list of approximate answers. In particular, we propose an algorithm, called the SFI algorithm, that mines the most frequent items in a more adaptive and faster way for specific stream resources. Finally, our experimental results demonstrate the accuracy and efficiency of our approximation techniques.
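
The abstract does not describe the internals of the SFI algorithm or the smooth synopsis, so the sketch below is not the authors' method. It only illustrates the general idea of answering approximate Top-k queries from a bounded-memory synopsis, using the well-known Space-Saving counter summary as a stand-in technique. The names `SpaceSavingSynopsis`, `update`, `top_k`, and `top_k_over_streams` are hypothetical, and feeding several streams into one shared synopsis is an assumption made for illustration.

```python
class SpaceSavingSynopsis:
    """Bounded-memory frequent-item summary (Space-Saving).

    Stand-in illustration only; NOT the SFI algorithm from the paper.
    """

    def __init__(self, capacity):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self.counts = {}  # item -> estimated count (never below the true count)
        self.errors = {}  # item -> maximum possible overestimation

    def update(self, item):
        """Process one stream element."""
        if item in self.counts:
            self.counts[item] += 1
        elif len(self.counts) < self.capacity:
            self.counts[item] = 1
            self.errors[item] = 0
        else:
            # Evict the item with the smallest estimated count; the newcomer
            # inherits that count as its possible overestimation.
            victim = min(self.counts, key=self.counts.get)
            floor = self.counts.pop(victim)
            del self.errors[victim]
            self.counts[item] = floor + 1
            self.errors[item] = floor

    def top_k(self, k):
        """Return up to k (item, estimated count) pairs, largest first."""
        ranked = sorted(self.counts.items(),
                        key=lambda pair: (-pair[1], str(pair[0])))
        return ranked[:k]


def top_k_over_streams(streams, capacity, k):
    """Feed several streams into one shared synopsis (an assumption)."""
    synopsis = SpaceSavingSynopsis(capacity)
    for stream in streams:
        for item in stream:
            synopsis.update(item)
    return synopsis.top_k(k)


if __name__ == "__main__":
    streams = [["a", "b", "a"], ["a", "c"]]
    print(top_k_over_streams(streams, capacity=3, k=2))
```

The memory footprint is fixed by `capacity` regardless of stream length. Each reported count is an upper bound on the true frequency, and the matching entry in `errors` bounds how far it can overshoot.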