
An agglomerative hierarchical clustering using partial maximum array and incremental similarity computation method

Abstract

As the amount of data that can be processed grows in computer science, fast clustering algorithms are required, because traditional clustering algorithms are not feasible for very large, high-dimensional data. Many studies on clustering large databases have been reported, but most of them circumvent this problem with an approximation method, which reduces accuracy. In this paper, we propose a new clustering algorithm based on a partial maximum array. It performs agglomerative hierarchical clustering with the same accuracy as the brute-force algorithm and has O(N^2) time complexity. We also present an incremental method of similarity computation that replaces the time-consuming calculation of vector similarity with a scalar calculation. Experimental results show that clustering becomes significantly faster for large, high-dimensional data.
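
The abstract does not spell out either technique, so the following is only a rough sketch of the two general ideas it names, not the paper's algorithm. It assumes group-average linkage over dot-product similarity, where the similarity of a merged cluster to any other cluster is a size-weighted average of two already known scalars, so no vectors are touched after initialization. The per-cluster cache `best` is a simplified, hypothetical stand-in for the partial maximum array, and the names `agglomerate`, `row_max`, and `best` are illustrative. This sketch makes no claim to reach the O(N^2) bound stated above.

```python
def dot(u, v):
    """Plain dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def agglomerate(points):
    """Group-average agglomerative clustering (illustrative sketch).

    Returns a list of merges (kept_id, absorbed_id, similarity).
    """
    n = len(points)
    # Vector similarities are computed exactly once, up front.
    sim = [[dot(points[i], points[j]) for j in range(n)] for i in range(n)]
    size = [1] * n
    active = set(range(n))

    def row_max(i):
        # Most similar active partner of cluster i, as (similarity, partner).
        return max((sim[i][j], j) for j in active if j != i)

    # Assumed stand-in for a "maximum array": one cached maximum per row.
    best = {i: row_max(i) for i in active}
    merges = []

    while len(active) > 1:
        # Scan the cached row maxima instead of the whole matrix.
        i = max(sorted(active), key=lambda k: best[k][0])
        s, j = best[i]

        # Incremental update: a scalar weighted average replaces any
        # recomputation from the underlying vectors.
        for k in active:
            if k != i and k != j:
                new = (size[i] * sim[i][k] + size[j] * sim[j][k]) / (size[i] + size[j])
                sim[i][k] = sim[k][i] = new

        size[i] += size[j]
        active.remove(j)
        del best[j]
        merges.append((i, j, s))

        if len(active) > 1:
            best[i] = row_max(i)
            for k in active:
                # An averaged similarity never exceeds the larger of its two
                # inputs, so a cached maximum stays valid unless it pointed
                # at one of the two clusters that were just merged.
                if k != i and best[k][1] in (i, j):
                    best[k] = row_max(k)

    return merges
```

For example, `agglomerate([(1.0, 0.0), (0.9, 0.1), (0.0, 1.0), (0.2, 0.8)])` first merges points 0 and 1, then points 2 and 3, and finally joins the two resulting clusters. Because every update is exact rather than approximate, the merge order matches what a brute-force scan of the similarity matrix would produce under the same linkage, apart from tie-breaking.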