A Support Vector and K-Means Based Hybrid Intelligent Data Clustering Algorithm

Liang SUN; Shinichi YOSHIDA; Yanchun LIANG

Abstract

Support vector clustering (SVC), a recently developed unsupervised learning algorithm, has been successfully applied to many real-life data clustering problems. However, its effectiveness and advantages deteriorate on complex real-world problems, e.g., those with a large proportion of noisy data points and with connecting clusters. This paper proposes a support vector and K-Means based hybrid algorithm to improve the performance of SVC. A new SVC training method is developed based on an analysis of a Gaussian kernel radius function. An empirical study is conducted to guide the selection of the standard deviation of the Gaussian kernel. In the proposed algorithm, the outliers, which increase problem complexity, are first identified and removed by training a global SVC. The refined data set is then clustered by a kernel-based K-Means algorithm. Finally, several local SVCs are trained for the clusters, and each removed data point is labeled according to its distance to the local SVCs. Since it exploits the advantages of both SVC and K-Means, the proposed algorithm is capable of clustering compact and arbitrarily organized data sets and of increasing robustness to outliers and connecting clusters. Experiments are conducted on 2-D data sets generated by mixture models and on benchmark data sets taken from the UCI machine learning repository. The cluster error rate is lower than 3.0% for all the selected data sets. The results demonstrate that the proposed algorithm compares favorably with existing SVC algorithms.
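
The middle stage of the pipeline described above, kernel-based K-Means with a Gaussian kernel, can be illustrated with a minimal sketch. This is not the authors' implementation: the function names `gaussian_kernel` and `kernel_kmeans`, the deterministic farthest-point seeding, and the default parameters are assumptions made for illustration, and the global and local SVC stages are not shown. The sketch assumes `k` does not exceed the number of points.

```python
import math


def gaussian_kernel(x, y, sigma):
    """Gaussian kernel with standard deviation sigma."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-sq_dist / (2.0 * sigma ** 2))


def kernel_kmeans(points, k, sigma=1.0, max_iter=100):
    """Cluster points by K-Means in the feature space induced by the kernel.

    Hypothetical helper for illustration; returns one label per point.
    """
    n = len(points)
    gram = [[gaussian_kernel(p, q, sigma) for q in points] for p in points]

    # Seeding (an illustrative choice, not taken from the paper): start from
    # point 0, then repeatedly add the point least similar to the seeds so far.
    seeds = [0]
    while len(seeds) < k:
        candidates = [i for i in range(n) if i not in seeds]
        seeds.append(min(candidates,
                         key=lambda i: max(gram[i][s] for s in seeds)))
    labels = [max(range(k), key=lambda c: gram[i][seeds[c]]) for i in range(n)]

    for _ in range(max_iter):
        members = [[i for i in range(n) if labels[i] == c] for c in range(k)]
        # Mean pairwise kernel value inside each cluster (0.0 if it is empty).
        within = [
            sum(gram[a][b] for a in m for b in m) / len(m) ** 2 if m else 0.0
            for m in members
        ]
        new_labels = []
        for i in range(n):
            best_c, best_d = labels[i], float("inf")
            for c, m in enumerate(members):
                if not m:
                    continue  # an empty cluster cannot attract points
                cross = sum(gram[i][j] for j in m) / len(m)
                # Squared feature-space distance from point i to the cluster mean.
                d = gram[i][i] - 2.0 * cross + within[c]
                if d < best_d:
                    best_c, best_d = c, d
            new_labels.append(best_c)
        if new_labels == labels:
            break  # assignments are stable
        labels = new_labels
    return labels


if __name__ == "__main__":
    blob_a = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0)]
    blob_b = [(10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0)]
    print(kernel_kmeans(blob_a + blob_b, k=2, sigma=2.0))
```

The cluster means are never computed explicitly; the distance to each mean is expressed entirely through kernel values, which is what allows K-Means to operate in the kernel-induced feature space. The value of `sigma` plays the role of the Gaussian kernel's standard deviation, whose selection the abstract says is guided by an empirical study.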

Bibliographic record

Source: IEICE Transactions on Information and Systems, 2011, Issue 11, 10 pages
Authors: Liang SUN; Shinichi YOSHIDA; Yanchun LIANG
Original format: PDF
CLC classification: Radio electronics, telecommunications technology
Date added: 2022-08-18 08:35:29

