MASSIVE CLUSTERING OF DISCRETE DISTRIBUTIONS

Abstract

The trend toward analyzing big data in artificial intelligence requires more scalable machine learning algorithms, among which clustering is a fundamental and arguably the most widely applied method. To extend the applications of regular vector-based clustering algorithms, the Discrete Distribution (D2) clustering algorithm has been developed for clustering bags of weighted vectors, a representation widely adopted in many emerging machine learning applications. The high computational complexity of D2-clustering limits its impact in solving massive learning problems. Here we present a parallel D2-clustering algorithm with substantially improved scalability. We develop a hierarchical structure for parallel computing in order to balance the computation performed on individual nodes against the integration process of the algorithm. The parallel algorithm achieves significant speed-up with minor accuracy loss.
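
The hierarchical divide-and-conquer structure mentioned in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the patented method: plain k-means on ordinary vectors stands in for the per-node D2-clustering step (real D2-clustering operates on bags of weighted vectors), threads stand in for compute nodes, and the names `local_kmeans`, `hierarchical_cluster`, `chunk_size`, and `local_k` are hypothetical. The sketch also ignores cluster sizes when passing centroids up a level.

```python
import random
from concurrent.futures import ThreadPoolExecutor


def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def local_kmeans(points, k, iters=20, seed=0):
    """Stand-in for the per-node clustering step (plain Lloyd k-means).

    Assumption: a real implementation would run D2-clustering here.
    """
    rng = random.Random(seed)
    k = min(k, len(points))
    centroids = rng.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            nearest = min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            groups[nearest].append(p)
        # Recompute each centroid as the mean of its group;
        # an empty group keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[i]
            for i, g in enumerate(groups)
        ]
    return centroids


def hierarchical_cluster(points, k, chunk_size=50, local_k=5, workers=4):
    """Cluster chunks in parallel, then cluster the resulting centroids,
    repeating level by level until the data fits in a single chunk."""
    if local_k >= chunk_size:
        raise ValueError("local_k must be smaller than chunk_size")
    level = list(points)
    while len(level) > chunk_size:
        chunks = [level[i:i + chunk_size]
                  for i in range(0, len(level), chunk_size)]
        with ThreadPoolExecutor(max_workers=workers) as pool:
            results = pool.map(lambda c: local_kmeans(c, local_k), chunks)
            # Centroids of this level become the data of the next level.
            level = [c for cs in results for c in cs]
    return local_kmeans(level, k)
```

In this sketch the chunk size controls the trade-off the abstract describes: smaller chunks mean cheaper work per node but more levels of integration, while larger chunks mean the opposite.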

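Bibliographic details

  • Publication number: US2014143251A1
  • Publication date: 2014-05-22
  • Original format: PDF
  • Applicant/assignee: THE PENN STATE RESEARCH FOUNDATION
  • Application number: US201314081525
  • Inventors: JIA LI; JAMES Z. WANG; YU ZHANG
  • Filing date: 2013-11-15
  • Classification: G06F17/30
  • Country: US
  • Indexed: 2022-08-21 16:08:23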
