Journal: Information Sciences: An International Journal

Dual incremental fuzzy schemes for frequent itemsets discovery in streaming numeric data

Abstract

Discovering frequent itemsets is essential for finding association rules, yet it is too computationally expensive with existing algorithms. It is even more challenging to find frequent itemsets in streaming numeric data. The streaming characteristic means that the data cannot be scanned repeatedly. The numeric characteristic requires that the data be pre-processed into itemsets; for example, fuzzy-set methods can transform numeric data into itemsets with non-integer membership values. As a result, the frequencies of itemsets are usually not integers. To overcome such challenges, fast methods and stream processing methods have been applied. However, existing algorithms usually either still need to revisit some previous data multiple times or cannot count non-integer frequencies. Algorithms that revisit previous data must sacrifice large amounts of memory to cache those data and avoid repetitive scanning. With today's big streaming data, such a large memory requirement often exceeds the capacity of many computers. Algorithms that cannot count non-integer frequencies would be very inaccurate in estimating the non-integer frequencies of frequent itemsets if used with an integer approximation of frequency counting.
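
To make the two challenges in the abstract concrete, the following is a minimal Python sketch of one-pass counting with non-integer frequencies. It is not the paper's dual incremental schemes, which the abstract does not describe in detail. Everything here is an illustrative assumption: the triangular membership functions, the three labels in `FUZZY_SETS`, the value range of 0 to 10, the use of the minimum as the combined membership of an itemset, and the names `fuzzify` and `FuzzySupportCounter`.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical fuzzy sets over an assumed value range of 0..10,
# each given as (left, peak, right) of a triangular membership function.
FUZZY_SETS = {
    "low": (-5.0, 0.0, 5.0),
    "mid": (0.0, 5.0, 10.0),
    "high": (5.0, 10.0, 15.0),
}


def triangular(x, left, peak, right):
    """Triangular membership: 0 outside (left, right), 1 at the peak."""
    if x <= left or x >= right:
        return 0.0
    if x <= peak:
        return (x - left) / (peak - left)
    return (right - x) / (right - peak)


def fuzzify(record):
    """Map {attribute: value} to {(attribute, label): membership > 0}."""
    items = {}
    for attr, value in record.items():
        for label, (left, peak, right) in FUZZY_SETS.items():
            mu = triangular(value, left, peak, right)
            if mu > 0.0:
                items[(attr, label)] = mu
    return items


class FuzzySupportCounter:
    """Accumulates non-integer itemset counts; each record is seen once."""

    def __init__(self, max_size=2):
        self.max_size = max_size
        self.n = 0                        # number of records seen so far
        self.counts = defaultdict(float)  # itemset -> fuzzy (non-integer) count

    def update(self, record):
        """Fold one record into the running counts, then discard it."""
        items = fuzzify(record)
        self.n += 1
        keys = sorted(items)
        for size in range(1, self.max_size + 1):
            for itemset in combinations(keys, size):
                # Skip itemsets that combine two labels of the same attribute.
                if len({attr for attr, _ in itemset}) < size:
                    continue
                # Assumed combination rule: minimum membership over the items.
                self.counts[itemset] += min(items[k] for k in itemset)

    def frequent(self, min_support):
        """Return itemsets whose fuzzy support (count / n) meets the threshold."""
        if self.n == 0:
            return {}
        return {s: c / self.n for s, c in self.counts.items()
                if c / self.n >= min_support}


counter = FuzzySupportCounter(max_size=2)
for rec in [{"temp": 2.5, "load": 10.0}, {"temp": 0.0, "load": 7.5}]:
    counter.update(rec)
result = counter.frequent(min_support=0.5)
```

In this sketch no record is stored or revisited after `update` returns, and the counts stay fractional instead of being rounded to integers. The number of tracked itemsets still grows with the number of attributes and with `max_size`, so this shows the counting idea only and is not a memory-bounded algorithm.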