
ClusMAM: fast and effective unsupervised clustering of large complex datasets using metric access methods


Abstract

Efficient and effective clustering is a core task of data mining, and it has become more important in today's big data scenario, where scalability is an issue. In this paper we present ClusMAM, a new strategy for clustering large complex datasets through metric access methods. ClusMAM aims to accelerate relational partitional clustering by taking advantage of the inherent node separations of metric access methods. Compared with other methods from the literature, ClusMAM is up to four orders of magnitude faster than its competitors while maintaining clustering quality. Additionally, ClusMAM exploits the datasets to find compact and coherent clusters, and suggests the number of clusters k found in the data. The method was evaluated on synthetic and real datasets, and its behavior was consistent with respect to both the number of distance calculations and the time required for clustering.
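The abstract does not spell out the steps of ClusMAM, so the following is not the authors' algorithm. It is a minimal sketch of the general idea the abstract points to: a relational partitional clustering (here, a plain swap-based k-medoids) that uses only a distance function and draws its medoids from a small candidate set. The candidate set stands in for the node representatives a metric access method might supply; how such representatives would be obtained is an assumption and is not shown. The names `CountingDistance`, `assign`, and `k_medoids_from_candidates` are hypothetical.

```python
import random
from typing import Callable, List, Sequence, Tuple

Point = Tuple[float, ...]


class CountingDistance:
    """Wraps a distance function and counts how often it is called."""

    def __init__(self, fn: Callable[[Point, Point], float]) -> None:
        self.fn = fn
        self.calls = 0

    def __call__(self, a: Point, b: Point) -> float:
        self.calls += 1
        return self.fn(a, b)


def assign(points: Sequence[Point], medoids: Sequence[Point],
           dist: Callable[[Point, Point], float]) -> Tuple[List[int], float]:
    """Assign each point to its nearest medoid; return labels and total cost."""
    labels: List[int] = []
    cost = 0.0
    for p in points:
        d, j = min((dist(p, m), j) for j, m in enumerate(medoids))
        labels.append(j)
        cost += d
    return labels, cost


def k_medoids_from_candidates(points: Sequence[Point],
                              candidates: Sequence[Point],
                              k: int,
                              dist: Callable[[Point, Point], float],
                              max_iter: int = 20,
                              seed: int = 0):
    """Swap-based k-medoids whose medoids come only from `candidates`.

    Assumption: `candidates` plays the role of representatives taken from
    the nodes of a metric index. Restricting swaps to this small set is
    what reduces the number of distance calculations in this sketch.
    """
    rng = random.Random(seed)
    medoids = rng.sample(list(candidates), k)
    labels, cost = assign(points, medoids, dist)
    for _ in range(max_iter):
        improved = False
        for i in range(k):
            for c in candidates:
                if c in medoids:
                    continue
                # Try replacing medoid i with candidate c.
                trial = medoids[:i] + [c] + medoids[i + 1:]
                t_labels, t_cost = assign(points, trial, dist)
                if t_cost < cost:
                    medoids, labels, cost = trial, t_labels, t_cost
                    improved = True
        if not improved:
            break
    return medoids, labels, cost
```

Passing a `CountingDistance` wrapped around, for example, `math.dist` makes it possible to compare the number of distance calculations for different candidate-set sizes, which is one of the two measures the abstract reports alongside running time.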
