
Study on algorithms of parallel and distributed data mining calculating process

Abstract

Building on distributed data mining, this paper introduces a parallel and distributed computing architecture that stores partitioned data on sub-nodes, combining the idea of a partitioned database with improved Apriori algorithms. The work focuses on data skew in the distributed environment and proposes a converse clustering method to address it. Corresponding parallel and distributed data mining algorithms are designed for large-scale transaction databases, and their computation processes are described in detail. Because the data are processed after effective partitioning, efficient communication among nodes greatly reduces the volume of data transmitted. The proposed algorithms provide a flexible, extensible computing platform, reduce communication overhead, and scale well. They are intended to carry out network computing, and to exploit its advantages, using fairly inexpensive computers. The proposed algorithms can be applied to large-scale parallel or distributed single-computer environments.
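The abstract does not give the algorithms themselves, so the following is only a minimal sketch of the general idea of partition-based frequent-itemset mining, not the authors' method. It assumes the classic two-phase scheme: each node finds the itemsets that are frequent in its own partition, and only those candidates and their counts are then exchanged and summed. All function names (`local_frequent_itemsets`, `count_candidates`, `mine_partitioned`) and the `max_size` cap are hypothetical. For brevity the local step enumerates itemsets by brute force rather than using Apriori's level-wise candidate pruning, and data-skew handling is not modeled.

```python
from collections import Counter
from itertools import combinations


def local_frequent_itemsets(partition, min_support, max_size=2):
    """Itemsets (as sorted tuples) whose support inside one partition
    is at least min_support, given as a fraction of that partition."""
    threshold = min_support * len(partition)
    counts = Counter()
    for transaction in partition:
        items = sorted(set(transaction))
        for size in range(1, max_size + 1):
            for itemset in combinations(items, size):
                counts[itemset] += 1
    return {itemset for itemset, n in counts.items() if n >= threshold}


def count_candidates(partition, candidates):
    """Count how many transactions in one partition contain each candidate."""
    counts = Counter()
    for transaction in partition:
        items = set(transaction)
        for itemset in candidates:
            if items.issuperset(itemset):
                counts[itemset] += 1
    return counts


def mine_partitioned(partitions, min_support, max_size=2):
    """Two-phase mining over partitions (each partition stands in for a sub-node)."""
    # Phase 1: every node reports only its locally frequent itemsets.
    # A globally frequent itemset must be locally frequent in at least
    # one partition, so the union is a complete candidate set.
    candidates = set()
    for partition in partitions:
        candidates |= local_frequent_itemsets(partition, min_support, max_size)

    # Phase 2: every node counts the shared candidates; only counts are merged.
    total = Counter()
    for partition in partitions:
        total.update(count_candidates(partition, candidates))

    n_transactions = sum(len(p) for p in partitions)
    threshold = min_support * n_transactions
    return {itemset: n for itemset, n in total.items() if n >= threshold}


if __name__ == "__main__":
    node_a = [["a", "b"], ["a", "c"], ["a", "b"]]
    node_b = [["b", "c"], ["a", "b"], ["b"]]
    print(mine_partitioned([node_a, node_b], min_support=0.5))
```

In this sketch the nodes never ship raw transactions, only candidate itemsets and integer counts, which illustrates why partitioning can reduce the volume of data transmitted between nodes.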
