Journal: Mathematical Problems in Engineering: Theory, Methods and Applications

FI-FG: Frequent Item Sets Mining from Datasets with High Number of Transactions by Granular Computing and Fuzzy Set Theory

Abstract

Mining frequent item sets (FIs) is an important problem in data mining. To address the limitations of exact algorithms and sampling methods, a novel FI mining algorithm based on granular computing and fuzzy set theory (FI-GF) is proposed, which mines datasets with a high number of transactions more efficiently. First, granulation is applied: the transactions are compressed into granules to reduce the scanning cost. During granulation, each granule is represented by a fuzzy set, and the number of transactions represented by a granule is optimized. Then, fuzzy set theory is used to compute the supports of item sets from those granules, which handles the uncertainty introduced by granulation and ensures the accuracy of the final results. Finally, Apriori is applied to obtain the FIs from the granules using the new support computation. On five datasets, FI-GF is compared with the original Apriori to demonstrate its reliability and efficiency, and with a representative progressive sampling method, RC-SS, to demonstrate the advantage of granulation over sampling. The results show that FI-GF not only reduces the time spent scanning transactions but is also highly reliable, and that granulation has advantages over progressive sampling methods.
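The three stages named in the abstract (granulation, fuzzy support, Apriori) can be illustrated with a minimal Python sketch. The abstract does not give the granulation rule, the support formula, or how the granule size is optimized, so everything below is an assumption made for illustration: granules are fixed-size chunks of consecutive transactions, the membership of an item in a granule is the fraction of the granule's transactions that contain it, and item-set support is estimated with the min t-norm weighted by granule size. The names `granulate`, `fuzzy_support`, and `apriori_on_granules` are hypothetical and are not taken from the paper.

```python
from itertools import combinations


def granulate(transactions, granule_size):
    """Compress consecutive transactions into granules (assumed scheme).

    Each granule is a pair (n, membership): n is the number of transactions
    it covers, and membership[item] is the fraction of them containing item.
    """
    granules = []
    for start in range(0, len(transactions), granule_size):
        chunk = transactions[start:start + granule_size]
        counts = {}
        for transaction in chunk:
            for item in set(transaction):
                counts[item] = counts.get(item, 0) + 1
        n = len(chunk)
        granules.append((n, {item: c / n for item, c in counts.items()}))
    return granules


def fuzzy_support(itemset, granules):
    """Estimate support with the min t-norm, weighted by granule size.

    Within a granule, min(memberships) is an upper bound on the fraction of
    transactions containing every item, so this estimate never falls below
    the exact support.
    """
    total = sum(n for n, _ in granules)
    covered = sum(n * min(m.get(item, 0.0) for item in itemset)
                  for n, m in granules)
    return covered / total


def apriori_on_granules(granules, min_support):
    """Level-wise Apriori that scans granules instead of raw transactions."""
    items = sorted({item for _, m in granules for item in m})
    level = [frozenset([item]) for item in items]
    frequent = {}
    k = 1
    while level:
        # Keep the candidates whose estimated support reaches the threshold.
        kept = {}
        for candidate in level:
            support = fuzzy_support(candidate, granules)
            if support >= min_support:
                kept[candidate] = support
        frequent.update(kept)
        # Join step: merge frequent k-item sets into (k+1)-item candidates.
        k += 1
        candidates = {a | b for a in kept for b in kept if len(a | b) == k}
        # Prune step: every k-item subset of a candidate must be frequent.
        level = [c for c in candidates
                 if all(frozenset(s) in kept for s in combinations(c, k - 1))]
    return frequent
```

Under these assumptions, a granule size of 1 reproduces exact Apriori, while larger granules mean fewer records to scan per pass at the cost of possibly overestimating support. For example, `apriori_on_granules(granulate(transactions, 2), 0.5)` mines a list of item lists in granules of two transactions each. Because the min t-norm is anti-monotone in the item set, the usual Apriori pruning remains valid on the estimated supports.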
