Journal: Knowledge-Based Systems

MapReduce based improved quick reduct algorithm with granular refinement using vertical partitioning scheme


Abstract

Over the last few decades, rough sets have become an essential technique for feature subset selection through reduct computation in categorical decision systems. With the recent proliferation of MapReduce for distributed and parallel algorithms, several scalable MapReduce-based reduct computation algorithms have been developed for large-scale decision systems. Existing MapReduce-based approaches partition the dataset horizontally (division in object space) across the nodes of the cluster, which requires a complicated shuffle and sort phase. In this work, we propose MR_IQRA_VP, an algorithm designed around vertical partitioning (division in attribute space) of the dataset, which simplifies the shuffle and sort phase of the MapReduce framework. MR_IQRA_VP is a distributed/parallel implementation of the Improved Quick Reduct Algorithm (IQRA_IG) and is implemented using the iterative MapReduce framework of Apache Spark. We conducted an extensive experimental comparison on benchmark decision systems against existing reduct computation algorithms based on horizontal partitioning. Through experimental analysis, together with theoretical validation, we establish that MR_IQRA_VP is suitable for, and scales to, datasets with a large attribute space and a moderate object space, which are prevalent in Bioinformatics and Web mining. (C) 2019 Elsevier B.V. All rights reserved.
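The idea of vertical partitioning can be illustrated with a small single-process sketch in plain Python. It is not the MR_IQRA_VP algorithm itself: the function names are hypothetical, the score is a simple count of consistently classified objects rather than the IQRA_IG heuristic, and the "cluster" is simulated with ordinary lists instead of Apache Spark. The sketch assumes that each partition holds a subset of attribute columns, that the current granule ids and the decision column are available to every partition, and that each partition returns only one (score, attribute) pair per iteration.

```python
from collections import defaultdict


def vertical_partitions(attributes, n_parts):
    """Split attribute names round-robin into n_parts groups (attribute space)."""
    return [attributes[i::n_parts] for i in range(n_parts)]


def consistent_count(granule_ids, column, decisions):
    """Count objects whose refined granule (granule id + column value)
    contains a single decision value."""
    seen = defaultdict(set)
    size = defaultdict(int)
    for g, v, d in zip(granule_ids, column, decisions):
        seen[(g, v)].add(d)
        size[(g, v)] += 1
    return sum(n for key, n in size.items() if len(seen[key]) == 1)


def refine(granule_ids, column):
    """Refine the current granules by the values of one more attribute."""
    ids = {}
    return [ids.setdefault((g, v), len(ids)) for g, v in zip(granule_ids, column)]


def map_partition(part, table, granule_ids, decisions, chosen):
    """'Map' step: score only the locally held attributes, emit the local best."""
    scores = [(consistent_count(granule_ids, table[a], decisions), a)
              for a in part if a not in chosen]
    return max(scores) if scores else None


def quick_reduct_vp(table, decisions, n_parts=2):
    """Greedy attribute selection over vertically partitioned columns.
    Returns a set of attributes as consistent as the full attribute set
    (possibly a super-reduct, since no redundancy check is done)."""
    attributes = sorted(table)
    parts = vertical_partitions(attributes, n_parts)
    n = len(decisions)
    dummy = [0] * n  # constant column: scores granules without refining them

    # Target: consistency achieved when all attributes are used.
    full = [0] * n
    for a in attributes:
        full = refine(full, table[a])
    target = consistent_count(full, dummy, decisions)

    granule_ids = [0] * n
    chosen = []
    score = consistent_count(granule_ids, dummy, decisions)
    while score < target and len(chosen) < len(attributes):
        # Each partition reports one (score, attribute) pair.
        local_best = [map_partition(p, table, granule_ids, decisions, chosen)
                      for p in parts]
        # 'Reduce' step: pick the globally best candidate.
        score, best = max(b for b in local_best if b is not None)
        chosen.append(best)
        granule_ids = refine(granule_ids, table[best])
    return chosen


table = {"a": [0, 0, 1, 1], "b": [0, 1, 0, 1], "c": [0, 0, 0, 0]}
decisions = [0, 1, 0, 1]
print(quick_reduct_vp(table, decisions))
```

In this sketch the only data that crosses partition boundaries per iteration is one small pair per partition plus the updated granule ids, which gives an intuition for why partitioning by attributes can keep the shuffle step simple when the attribute space is large and the object space is moderate.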
