Subgroup discovery on Big Data: exhaustive methodologies using Map-Reduce


Abstract

Subgroup discovery is a flexible supervised local pattern mining technique that aims to find interesting subgroups with respect to a single property of interest. Although many efficient algorithms have been developed in this field, the growing interest in data storage has produced ever larger datasets, which hamper the performance of those algorithms. This paper proposes two new algorithms for discovering subgroups on Big Data. Both follow the MapReduce paradigm and, specifically, use Apache Spark to meet Big Data requirements. The experimental study considers more than 40 high-dimensional datasets and a set of efficient algorithms from the subgroup discovery field, with search spaces larger than 3.3 × 10^13 available subgroups. The experimental analysis shows that the proposed algorithms achieve excellent efficiency, demonstrating the usefulness of Apache Spark in this field.
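
To make the idea of exhaustive subgroup discovery under MapReduce more concrete, here is a minimal sketch in plain Python. It is not one of the two algorithms proposed in the paper, and it does not use Apache Spark. It only illustrates, under stated assumptions, how counting the rows covered by every candidate subgroup can be split into a map step per data partition and a reduce step that sums the counts. The function names (`map_partition`, `merge_counts`, `discover`), the toy dataset, the limit `max_len`, and the choice of weighted relative accuracy (WRAcc) as the quality measure are all assumptions made for illustration.

```python
from collections import Counter
from functools import reduce
from itertools import combinations


def map_partition(rows, target, positive, max_len=2):
    """Map step (hypothetical): for one partition, count covered rows and
    covered positive rows for every conjunction of up to max_len conditions."""
    counts = Counter()
    for row in rows:
        # Canonical ordering so the same conjunction always gets the same key.
        items = sorted((a, v) for a, v in row.items() if a != target)
        is_pos = int(row[target] == positive)
        # k = 0 yields the empty description, which tracks dataset totals.
        for k in range(0, max_len + 1):
            for desc in combinations(items, k):
                counts[(desc, "n")] += 1
                counts[(desc, "pos")] += is_pos
    return counts


def merge_counts(left, right):
    """Reduce step: sum the per-partition counts key by key."""
    merged = Counter(left)
    merged.update(right)
    return merged


def discover(partitions, target, positive, max_len=2):
    """Score every observed subgroup with WRAcc (an assumed quality measure)."""
    mapped = [map_partition(p, target, positive, max_len) for p in partitions]
    totals = reduce(merge_counts, mapped, Counter())
    n_all, pos_all = totals[((), "n")], totals[((), "pos")]
    scores = {}
    for (desc, kind), n in totals.items():
        if kind == "n" and desc:
            pos = totals[(desc, "pos")]
            # WRAcc = coverage * (subgroup positive rate - overall positive rate)
            scores[desc] = (n / n_all) * (pos / n - pos_all / n_all)
    return scores


# Toy data, split into two partitions as a cluster framework might do.
rows = [
    {"outlook": "sunny", "windy": "no", "play": "yes"},
    {"outlook": "sunny", "windy": "yes", "play": "yes"},
    {"outlook": "rainy", "windy": "yes", "play": "no"},
    {"outlook": "rainy", "windy": "no", "play": "no"},
]
scores = discover([rows[:2], rows[2:]], target="play", positive="yes")
best = max(scores, key=scores.get)
```

Because each partition is processed independently and the partial counts are merged by simple addition, the same structure could in principle be handed to a distributed engine. The number of candidate conjunctions grows combinatorially with the number of attributes, which is why the search spaces mentioned in the abstract become so large.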
