Journal: Computing and Informatics

REPRESENTATIVE POINTS AND CLUSTER ATTRIBUTES BASED INCREMENTAL SEQUENCE CLUSTERING ALGORITHM


Abstract

To improve the execution time and clustering quality of sequence clustering on large-scale dynamic datasets, this paper presents a novel algorithm, RPCAISC (Representative Points and Cluster Attributes Based Incremental Sequence Clustering). A density factor is defined for representative points: any primary representative point whose density factor falls below a prescribed threshold is deleted directly, and new representative points can be reselected from the non-representative points. The representative points of each cluster are modeled using the K-nearest neighbor method. A relevant degree (RD) between clusters is also defined. The RD is computed by jointly considering the correlations of objects within a cluster and between different clusters, and it determines whether two clusters should be merged. The cluster attributes of the initial clustering are retained throughout this process. Dynamic sequence clustering is then achieved by calculating the matching degree between each incremental sequence and the existing cluster attributes. Theoretical analysis and experimental results show that RPCAISC achieves a better clustering accuracy rate and better execution efficiency.
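The abstract does not give formulas for the density factor or the matching degree, so the following Python sketch only illustrates the general shape of two of the steps it names: pruning low-density representative points, and assigning an incremental sequence by matching degree. Everything specific here is an assumption made for illustration and is not taken from the paper: the use of edit distance between sequences, the definition of the density factor as the share of cluster members within a radius, the matching degree as an inverse nearest-representative distance, and all function and parameter names. The RD-based merging step and the reselection of representative points are left out.

```python
from typing import Dict, List, Optional


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance via the standard dynamic-programming recurrence."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        cur = [i]
        for j, cb in enumerate(b, start=1):
            cur.append(min(prev[j] + 1,                # deletion
                           cur[j - 1] + 1,             # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]


def density_factor(rep: str, members: List[str], radius: int) -> float:
    """ASSUMED definition: share of cluster members within `radius` of `rep`."""
    near = sum(1 for m in members if edit_distance(rep, m) <= radius)
    return near / len(members)


def prune_representatives(reps: List[str], members: List[str],
                          radius: int, threshold: float) -> List[str]:
    """Drop representatives whose (assumed) density factor is below `threshold`."""
    return [r for r in reps if density_factor(r, members, radius) >= threshold]


def matching_degree(seq: str, reps: List[str]) -> float:
    """ASSUMED definition: 1 / (1 + distance to the nearest representative)."""
    return 1.0 / (1.0 + min(edit_distance(seq, r) for r in reps))


def assign_incremental(seq: str, clusters: Dict[str, List[str]],
                       min_match: float) -> Optional[str]:
    """Return the best-matching cluster name, or None if no cluster
    reaches `min_match` (the caller could then open a new cluster)."""
    best_name, best_score = None, 0.0
    for name, reps in clusters.items():
        score = matching_degree(seq, reps)
        if score > best_score:
            best_name, best_score = name, score
    return best_name if best_score >= min_match else None


if __name__ == "__main__":
    members = ["ACGT", "ACGA", "ACGG", "TTTT"]
    kept = prune_representatives(["ACGT", "TTTT"], members, radius=1, threshold=0.5)
    print(kept)
    clusters = {"c1": ["ACGT"], "c2": ["TTTT"]}
    print(assign_incremental("ACGC", clusters, min_match=0.4))
    print(assign_incremental("GGGG", clusters, min_match=0.4))
```

In this toy setup the incremental sequence is compared only against stored representatives rather than against every clustered sequence, which is the kind of saving an attribute-based incremental scheme aims for; how RPCAISC actually encodes cluster attributes may differ.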
