
An efficient automated incremental density-based algorithm for clustering and classification



Abstract

Data clustering divides a dataset into groups. Incremental Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a well-known density-based clustering technique that can find clusters of varying sizes and shapes. The quality of incremental DBSCAN results depends on two input parameters: MinPts (Minimum Points) and Eps (Epsilon). Parameter setting is therefore one of the major problems of incremental DBSCAN. This article presents an improved incremental DBSCAN based on the Non-dominated Sorting Genetic Algorithm Ⅱ (NSGA-Ⅱ) to address this issue. The proposed algorithm adjusts the two parameters (MinPts and Eps) of incremental DBSCAN iteratively, guided by fitness functions, to enhance clustering precision. The method introduces suitable fitness functions for both labeled and unlabeled datasets. The efficiency of the proposed hybrid algorithm is further improved by parallelizing the optimization process. The method is evaluated on several textual and numerical datasets with different shapes, sizes, and dimensions. According to the experimental results, the proposed algorithm provides better results than Multi-Objective Particle Swarm Optimization (MOPSO) based incremental DBSCAN and a few well-known techniques, particularly on shape and balanced datasets. The parallel model also achieves a good speed-up compared with the serial version of the algorithm.
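The core idea of the abstract, scoring candidate (Eps, MinPts) pairs against several fitness functions and keeping the non-dominated ones, can be illustrated with a minimal sketch. It is not the authors' method: it uses plain batch DBSCAN rather than the incremental variant, an exhaustive candidate grid with a single Pareto filter rather than NSGA-Ⅱ's evolutionary loop (no crossover, mutation, or crowding distance), and no parallelization. The abstract does not specify the fitness functions, so the two objectives for unlabeled data used here (noise fraction and mean distance to the cluster centroid) are assumptions, as are all function names and the toy data.

```python
import math
from itertools import product

def dbscan(points, eps, min_pts):
    """Plain batch DBSCAN (not incremental); returns one label per point, -1 = noise."""
    n = len(points)
    # Eps-neighbourhoods; each point counts as its own neighbour.
    neigh = [[j for j in range(n) if math.dist(points[i], points[j]) <= eps]
             for i in range(n)]
    labels = [None] * n
    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        if len(neigh[i]) < min_pts:
            labels[i] = -1              # provisional noise, may become a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = list(neigh[i])
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster     # noise reached from a core point becomes border
            if labels[j] is not None:
                continue
            labels[j] = cluster
            if len(neigh[j]) >= min_pts:
                queue.extend(neigh[j])  # j is a core point: keep expanding
    return labels

def objectives(points, labels):
    """Assumed objectives, both minimised: (noise fraction, mean distance to centroid)."""
    clusters = {}
    for p, lab in zip(points, labels):
        if lab != -1:
            clusters.setdefault(lab, []).append(p)
    noise = labels.count(-1) / len(points)
    if not clusters:
        return (noise, math.inf)        # everything is noise: worst compactness
    total, count = 0.0, 0
    for members in clusters.values():
        centroid = [sum(c) / len(members) for c in zip(*members)]
        total += sum(math.dist(p, centroid) for p in members)
        count += len(members)
    return (noise, total / count)

def dominates(a, b):
    """True if a is no worse than b on every objective and strictly better on one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))

def pareto_front(scored):
    """Keep the (params, score) entries that no other entry dominates."""
    return [(p, s) for p, s in scored
            if not any(dominates(t, s) for _, t in scored)]

# Hypothetical toy data: two tight square blobs and one outlier.
points = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
          (5.0, 20.0)]

# Candidate (Eps, MinPts) pairs; a real NSGA-II run would evolve these instead.
candidates = list(product([0.5, 1.5, 15.0], [3, 5]))
scored = [((eps, m), objectives(points, dbscan(points, eps, m)))
          for eps, m in candidates]
front = pareto_front(scored)

for params, score in front:
    print(params, score)
```

On this toy data the front keeps (1.5, 3), which isolates the outlier as noise and yields two compact clusters, alongside (15.0, 3) and (15.0, 5), which leave no noise but merge everything into one loose cluster. That trade-off is the kind a multi-objective optimizer has to expose rather than collapse into a single score.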

Bibliographic record

  • Source
    Future Generation Computer Systems | 2021, Issue 1 | pp. 665-678 | 14 pages
  • Author affiliations

    Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran;

    Future Technology Research Center, National Yunlin University of Science and Technology, 123 University Road, Section 3, Douliou, Yunlin 64002, Taiwan, ROC;

    Institute of Research and Development, Duy Tan University, Da Nang 550000, Viet Nam; Health Management and Economics Research Center, Iran University of Medical Sciences, Tehran, Iran;

    Department of Computer Engineering, Science and Research Branch, Islamic Azad University, Tehran, Iran;

    Department of Information Technology, University of Human Development, Sulaymaniyah, Iraq;

  • Indexed in: Science Citation Index (SCI), USA; Engineering Index (EI), USA
  • Original format: PDF
  • Language: English
  • Chinese Library Classification:
  • Keywords

    Incremental clustering; DBSCAN; Parameter tuning; NSGA-Ⅱ; Parallel processing;

