Journal: Integrated Computer-Aided Engineering

Enhancing the scalability of a genetic algorithm to discover quantitative association rules in large-scale datasets



Abstract

Association rule mining is a well-known methodology for discovering significant, apparently hidden relations among attributes in a subspace of instances from datasets. Genetic algorithms have been used extensively to find interesting association rules. However, the rule-matching task in such techniques usually demands substantial computation and memory. Given the high volume of data generated nowadays, efficient computational techniques have become essential. Hence, this paper aims to improve the scalability of quantitative association rule mining techniques based on genetic algorithms so that they can handle large-scale datasets without quality loss in the results obtained. For this purpose, a new representation of the individuals, new genetic operators, and a windowing-based learning scheme are proposed. Specifically, the proposed techniques are integrated into the multi-objective evolutionary algorithm named QARGA-M to assess their performance. Both the standard and the enhanced versions of QARGA-M have been tested on several datasets with different numbers of attributes and instances. Furthermore, the proposed methodologies have been integrated into other existing techniques based on genetic algorithms to discover quantitative association rules. The comparative analysis shows that applying the proposed techniques significantly reduces the computational cost of QARGA-M and the other genetic algorithms without losing quality in the results.
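To make the rule-matching cost concrete, the following is a minimal sketch of how a quantitative association rule might be matched against a dataset and how a windowing scheme could shrink the data scanned per evaluation. The abstract does not describe QARGA-M's actual representation, operators, or windowing scheme, so everything here is an assumption for illustration: the interval-based rule encoding, the names `covers`, `support_confidence`, and `window`, the round-robin stripe selection, and the toy data.

```python
from typing import Dict, List, Tuple

Interval = Tuple[float, float]  # closed interval [low, high]
Rule = Tuple[Dict[str, Interval], Dict[str, Interval]]  # (antecedent, consequent)
Row = Dict[str, float]


def covers(intervals: Dict[str, Interval], row: Row) -> bool:
    """True if every listed attribute of the row lies inside its interval."""
    return all(low <= row[attr] <= high for attr, (low, high) in intervals.items())


def support_confidence(rule: Rule, rows: List[Row]) -> Tuple[float, float]:
    """Scan the rows once and return (support, confidence) of the rule."""
    antecedent, consequent = rule
    n_ante = n_both = 0
    for row in rows:
        if covers(antecedent, row):
            n_ante += 1
            if covers(consequent, row):
                n_both += 1
    support = n_both / len(rows) if rows else 0.0
    confidence = n_both / n_ante if n_ante else 0.0
    return support, confidence


def window(rows: List[Row], generation: int, n_windows: int) -> List[Row]:
    """Hypothetical round-robin windowing: generation g sees only stripe g mod n_windows."""
    return rows[generation % n_windows::n_windows]


# Toy data, invented for this sketch.
rows = [
    {"age": 25, "income": 30},
    {"age": 35, "income": 60},
    {"age": 40, "income": 65},
    {"age": 50, "income": 40},
]
# Rule: age in [30, 45] => income in [55, 70]
rule = ({"age": (30, 45)}, {"income": (55, 70)})

full = support_confidence(rule, rows)
partial = support_confidence(rule, window(rows, generation=1, n_windows=2))
```

Each fitness evaluation costs one pass over the rows it is given, so evaluating on one of `n_windows` stripes scans roughly `1/n_windows` of the data. The estimate from a stripe can differ from the full-data value, which is why any such scheme has to be checked for quality loss, as the abstract reports doing.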
