
Study on the use of different quality measures within a multi-objective evolutionary algorithm approach for emerging pattern mining in big data environments


Abstract

Background: Emerging pattern mining is a data mining task that extracts rules describing discriminative relationships amongst variables. These rules should be understandable to experts. The comprehensibility of a rule is traditionally determined by several objectives, which can be calculated with different measures, so multi-objective evolutionary algorithms are well suited to this task. The growing amount of data now means that traditional data mining algorithms cannot process it in a reasonable time. These huge volumes of data make the extraction of rules that can easily describe the underlying phenomena even more interesting. So far, only one emerging pattern mining algorithm based on multi-objective evolutionary algorithms has been developed for big data: the BD-EFEP algorithm. This paper analyses how the choice of different quality measures as objectives influences the search process.

Results: The results show that the combination of the Jaccard index and the false positive rate offers the best trade-off for descriptive induction of emerging patterns.

Conclusions: Using this combination of quality measures as optimisation objectives is recommended in future multi-objective evolutionary algorithm developments for emerging pattern mining focused on big data.
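The two measures named in the results can be illustrated with a minimal sketch. It assumes the usual confusion-matrix definitions for a single rule, Jaccard = TP / (TP + FP + FN) and false positive rate = FP / (FP + TN), and a standard Pareto-dominance test in which the Jaccard index is maximised and the false positive rate is minimised. The names `Confusion`, `jaccard`, `false_positive_rate` and `dominates` are hypothetical, chosen for this illustration; they are not taken from BD-EFEP.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Confusion:
    """Confusion-matrix counts of one rule against the target class (hypothetical helper)."""
    tp: int  # covered examples of the target class
    fp: int  # covered examples of other classes
    tn: int  # uncovered examples of other classes
    fn: int  # uncovered examples of the target class


def jaccard(c: Confusion) -> float:
    """Jaccard index: TP / (TP + FP + FN); 0.0 when the denominator is zero."""
    denom = c.tp + c.fp + c.fn
    return c.tp / denom if denom else 0.0


def false_positive_rate(c: Confusion) -> float:
    """False positive rate: FP / (FP + TN); 0.0 when there are no negatives."""
    denom = c.fp + c.tn
    return c.fp / denom if denom else 0.0


def dominates(a: Confusion, b: Confusion) -> bool:
    """Pareto dominance, assuming Jaccard is maximised and FPR is minimised."""
    ja, fa = jaccard(a), false_positive_rate(a)
    jb, fb = jaccard(b), false_positive_rate(b)
    no_worse = ja >= jb and fa <= fb
    strictly_better = ja > jb or fa < fb
    return no_worse and strictly_better


# Two illustrative rules with made-up counts (not data from the paper).
rule_a = Confusion(tp=40, fp=10, tn=90, fn=10)
rule_b = Confusion(tp=30, fp=30, tn=70, fn=20)
print(jaccard(rule_a), false_positive_rate(rule_a))
print(dominates(rule_a, rule_b))
```

In a multi-objective evolutionary search, a dominance test of this kind would typically decide which candidate rules stay on the non-dominated front. How BD-EFEP does this internally is not described in the abstract.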
