Journal: Engineering Economics

Data Evolvement Analysis Based on Topology Self-Adaptive Clustering algorithm

Abstract

With the rapid advance of Internet technology, Internet users must deal with enormous volumes of data every day. One of the most useful kinds of knowledge for users concerns the transfer of information reflected by two data sets collected at different times. The task is to discover what information has newly appeared, what has become obsolete, and what remains unchanged; it is formally called data evolvement analysis. Clustering is a natural approach: by comparing the clustering results obtained at different times, the transfer of information is easy to identify. Unfortunately, this plan is impractical, because the clustering algorithm must be rerun every time the input data are updated, which is time-consuming. A dynamic clustering algorithm that automatically adjusts its structure to express the transfer is therefore needed. To this end, this paper proposes a novel Topology Self-Adaptive Clustering algorithm (TSAC). The algorithm is derived from the Self-Organizing Map (SOM) but requires no prior assumption about the neuron topology, and its topology is remodeled whenever the input data are updated. To further improve performance, it introduces a minimum spanning tree to preserve topological order, something no traditional SOM-based topology-adaptive algorithm does. To measure the extent of the transfer clearly, it partitions the data space into grid cells and computes the density of each cell to quantify the transfer. Experimental results demonstrate that TSAC automatically tunes its topology as the input data change. With this algorithm and the grid structure, the transfer of information can be visualized clearly.

DOI: http://dx.doi.org/10.5755/j01.itc.41.2.974
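The abstract does not say which graph the minimum spanning tree is built on. As a minimal sketch, assume the tree connects the neurons' weight vectors by Euclidean distance, so that neighbouring neurons are those joined by a tree edge. The function name `minimum_spanning_tree` and the sample prototypes are hypothetical, and Prim's algorithm is used here only as a standard way to build such a tree, not as the paper's procedure.

```python
import math

def minimum_spanning_tree(prototypes):
    """Prim's algorithm on the complete Euclidean graph over the prototypes.

    Returns a list of edges (i, j, distance), where i is already in the tree
    and j is the node being attached. Assumption: neurons are compared by
    Euclidean distance between their weight vectors.
    """
    n = len(prototypes)
    if n < 2:
        return []
    # best[j] = (cheapest distance from the current tree to j, tree node giving it)
    best = {j: (math.dist(prototypes[0], prototypes[j]), 0) for j in range(1, n)}
    edges = []
    while best:
        # Attach the outside node that is closest to the tree.
        j = min(best, key=lambda k: best[k][0])
        d, i = best.pop(j)
        edges.append((i, j, d))
        # The newly attached node may offer cheaper links to the remaining nodes.
        for k in best:
            dk = math.dist(prototypes[j], prototypes[k])
            if dk < best[k][0]:
                best[k] = (dk, j)
    return edges

# Hypothetical neuron weight vectors in a 2-D data space.
prototypes = [(0.0, 0.0), (1.0, 0.0), (5.0, 0.0), (1.0, 1.0)]
edges = minimum_spanning_tree(prototypes)
for i, j, d in edges:
    print(f"neuron {i} -- neuron {j} (distance {d:.2f})")
```

In a scheme like this, the tree would have to be rebuilt or repaired whenever neurons are added, removed, or moved; how TSAC does that is not stated in the abstract.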
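The grid-based quantification described in the abstract can be illustrated with a small sketch. It assumes equal-width cells, defines a cell's density as the fraction of a snapshot's points that fall in it, and labels each cell by comparing two snapshots. The names `grid_density` and `density_transfer`, the `cell_size` and `tol` parameters, and the sample points are all hypothetical; the paper's actual density measure may differ.

```python
from collections import Counter
from math import floor

def grid_density(points, cell_size):
    """Map each occupied grid cell (tuple of integer indices) to the
    fraction of points that fall inside it."""
    total = len(points)
    if total == 0:
        return {}
    counts = Counter(tuple(floor(x / cell_size) for x in p) for p in points)
    return {cell: n / total for cell, n in counts.items()}

def density_transfer(old_points, new_points, cell_size, tol=1e-9):
    """Compare per-cell densities of two snapshots and group cells by what
    happened to them. Values are density differences (new minus old)."""
    old = grid_density(old_points, cell_size)
    new = grid_density(new_points, cell_size)
    report = {"appeared": {}, "vanished": {}, "changed": {}, "unchanged": {}}
    for cell in old.keys() | new.keys():
        before, after = old.get(cell, 0.0), new.get(cell, 0.0)
        delta = after - before
        if before == 0.0:
            report["appeared"][cell] = delta      # information that newly appears
        elif after == 0.0:
            report["vanished"][cell] = delta      # information that is obsolete
        elif abs(delta) > tol:
            report["changed"][cell] = delta       # still present, different weight
        else:
            report["unchanged"][cell] = delta     # information that is maintained
    return report

# Two hypothetical snapshots of 2-D data collected at different times.
old_snapshot = [(0.1, 0.2), (0.3, 0.4), (1.5, 1.5), (1.6, 1.7)]
new_snapshot = [(0.2, 0.1), (0.4, 0.3), (2.5, 2.5), (2.6, 2.7)]
report = density_transfer(old_snapshot, new_snapshot, cell_size=1.0)
for label, cells in report.items():
    print(label, cells)
```

The three outcomes named in the abstract (newly appearing, obsolete, and unchanged information) correspond to the `appeared`, `vanished`, and `unchanged` groups; the `changed` group is an extra category added in this sketch for cells whose density shifts without disappearing.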
