
Incremental Support Vector Clustering Algorithm


Abstract

The Support Vector Clustering (SVC) algorithm extends the Support Vector Machine, which has mainly been applied to supervised learning problems such as classification and regression, to unsupervised learning. A kernel maps the data points from data space to a high-dimensional feature space. After a sphere analysis in feature space, SVC finds the Support Vectors (SVs) that describe the cluster boundaries. SVC is therefore a kernel-based and boundary-based method of cluster analysis. Its main advantages are that it detects cluster boundaries of arbitrary shape and is robust to noise, without prior knowledge of the data distribution. On the other hand, its high computational complexity makes learning expensive. As a result, for general databases and applications in which data updates are collected and applied to the database periodically, we must either disregard the new input patterns and keep the previously learned results, or discard the previous, expensively learned results and learn again after adding the new data to the existing database. In this paper, we retain the merits of SVC and extend it to incremental learning for periodically updated databases and a variety of data mining applications. The incoming data are processed by SVC block by block, and the results for each block are combined with the previous SVC results to give the overall clustering of the data. We demonstrate that the proposed incremental clustering algorithm produces high-quality clusters and identifies meaningful patterns in new input data streams.
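
The block-by-block idea can be illustrated with a minimal sketch. It is not the authors' method: the abstract does not say how block results are combined, so the sketch assumes the common heuristic of keeping only the previous support vectors and re-fitting them together with each new block. It further assumes a Gaussian kernel, a hard-margin sphere (no slack for outliers), and a simple Frank-Wolfe solver; the function names, the parameter `q`, and the weight threshold `tol` are all hypothetical. The cluster-labeling step of SVC is omitted.

```python
import math


def rbf(x, y, q=1.0):
    """Gaussian kernel K(x, y) = exp(-q * ||x - y||^2) (assumed kernel)."""
    return math.exp(-q * sum((a - b) ** 2 for a, b in zip(x, y)))


def fit_sphere(points, q=1.0, iters=500, tol=1e-3):
    """Approximate the smallest enclosing sphere in feature space.

    Because K(x, x) = 1 for the Gaussian kernel, the hard-margin sphere
    problem reduces to: minimise a^T K a with a >= 0 and sum(a) = 1.
    Frank-Wolfe solves this approximately. Points whose weight ends up
    above tol are returned as approximate support vectors.
    """
    n = len(points)
    K = [[rbf(p, r, q) for r in points] for p in points]
    a = [1.0 / n] * n
    for t in range(iters):
        Ka = [sum(K[i][j] * a[j] for j in range(n)) for i in range(n)]
        # Smallest (K a)_i corresponds to the point farthest from the centre.
        far = min(range(n), key=Ka.__getitem__)
        step = 2.0 / (t + 2)
        a = [(1 - step) * w for w in a]
        a[far] += step
    return [p for p, w in zip(points, a) if w > tol]


def incremental_svs(blocks, q=1.0):
    """Process data block by block, carrying forward only support vectors.

    Assumed combination rule: previous SVs + new block -> re-fit.
    """
    svs = []
    for block in blocks:
        svs = fit_sphere(svs + list(block), q)
    return svs
```

Calling `incremental_svs([block1, block2])` with two lists of coordinate tuples returns the approximate support vectors after both blocks have been seen. Each re-fit only touches the retained support vectors plus the new block, which is what keeps the cost below that of re-learning on the whole database.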
