To address the real-time computation and analysis challenges posed by the massive data volumes of future smart grids, this paper first builds a MapReduce framework on a Hadoop cloud computing platform and demonstrates its good data-computing performance. Experiments then expose several problems that limit further gains in MapReduce computing efficiency: uneven task scheduling, data skew, and poor adaptability in heterogeneous environments. Taking into account the drawbacks of the original MapReduce scheduling scheme, the paper presents an improved MapReduce architecture that balances data mapping and evaluates node performance, and proposes a Dynamic Matching Scheduling Algorithm (DMSA). Finally, cluster experiments on a simulation platform show reduced consumption of system computing resources, shorter running time, significantly improved cluster performance, and enhanced data locality, demonstrating that the strategy is a feasible way to improve MapReduce computing efficiency.