PSO based fast K-means algorithm for feature selection from high dimensional medical data set


Abstract

Features are the most important entity in any data mining or machine learning application. They are the backbone of any model: its reliability, efficiency, and accuracy depend on the choice of strong and relevant features. However, feature selection is a time-consuming and challenging task. In this paper, we propose an approach that combines a clustering technique with a stochastic technique to quickly select effective features from a high-dimensional breast cancer data set. To select strong and relevant features, we use an improved version of the K-means algorithm, called the fast K-means algorithm, which is much faster and more accurate than the general K-means algorithm. The fast K-means algorithm is embedded in the Particle Swarm Optimization (PSO) algorithm to produce better results. The results were validated using various classification techniques and evaluated on various performance measures, and they were found to strongly support the approach. The feature subset generated by the PSO based fast K-means algorithm on the KDD Cup 2008 data set produced an accuracy of 99.39%, and its time complexity was found to be O(log(k)).
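
The abstract does not describe the fast K-means variant or how it is embedded in PSO. As a rough illustration only, the sketch below shows one common way to pair the two ideas: a binary PSO searches over feature masks, and each mask is scored by how well K-means clusters built on the selected features agree with the class labels. Everything here is an assumption rather than the authors' method: the purity-based fitness, the subset-size penalty, the use of plain Lloyd-style K-means instead of the fast variant, and all function names and parameter values.

```python
import math
import random


def kmeans_labels(rows, k, iters=10, seed=0):
    """Plain Lloyd-style K-means (NOT the paper's fast variant); returns cluster ids."""
    rng = random.Random(seed)
    centers = [list(r) for r in rng.sample(rows, k)]
    assign = [0] * len(rows)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        for i, r in enumerate(rows):
            assign[i] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(r, centers[c])),
            )
        # Update step: move each non-empty cluster's center to its mean.
        for c in range(k):
            members = [rows[i] for i in range(len(rows)) if assign[i] == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign


def purity(assign, labels, k):
    """Fraction of samples that carry the majority label of their cluster."""
    total = 0
    for c in range(k):
        members = [labels[i] for i in range(len(labels)) if assign[i] == c]
        if members:
            total += max(members.count(v) for v in set(members))
    return total / len(labels)


def fitness(mask, X, y, k, penalty=0.01):
    """Assumed fitness: cluster purity minus a small penalty on subset size."""
    cols = [j for j, bit in enumerate(mask) if bit]
    if not cols:
        return 0.0  # an empty feature subset cannot be clustered
    rows = [[row[j] for j in cols] for row in X]
    return purity(kmeans_labels(rows, k), y, k) - penalty * len(cols) / len(mask)


def pso_select(X, y, k=2, particles=8, steps=15, w=0.7, c1=1.5, c2=1.5, seed=0):
    """Binary PSO over feature masks; returns (best_mask, best_fitness)."""
    rng = random.Random(seed)
    d = len(X[0])
    pos = [[rng.randint(0, 1) for _ in range(d)] for _ in range(particles)]
    vel = [[0.0] * d for _ in range(particles)]
    pbest = [p[:] for p in pos]
    pbest_fit = [fitness(p, X, y, k) for p in pos]
    g = max(range(particles), key=lambda i: pbest_fit[i])
    gbest, gbest_fit = pbest[g][:], pbest_fit[g]
    for _ in range(steps):
        for i in range(particles):
            for j in range(d):
                # Velocity update pulls toward personal and global bests.
                vel[i][j] = (
                    w * vel[i][j]
                    + c1 * rng.random() * (pbest[i][j] - pos[i][j])
                    + c2 * rng.random() * (gbest[j] - pos[i][j])
                )
                # Sigmoid of the velocity gives the probability of setting the bit.
                prob = 1.0 / (1.0 + math.exp(-vel[i][j]))
                pos[i][j] = 1 if rng.random() < prob else 0
            f = fitness(pos[i], X, y, k)
            if f > pbest_fit[i]:
                pbest[i], pbest_fit[i] = pos[i][:], f
                if f > gbest_fit:
                    gbest, gbest_fit = pos[i][:], f
    return gbest, gbest_fit


if __name__ == "__main__":
    # Toy data (hypothetical): feature 0 separates the classes, features 1-2 are noise.
    rng = random.Random(1)
    y = [i % 2 for i in range(40)]
    X = [[lab * 5 + rng.gauss(0, 0.3), rng.uniform(0, 5), rng.uniform(0, 5)] for lab in y]
    mask, score = pso_select(X, y)
    print("selected mask:", mask, "fitness:", round(score, 3))
```

In this sketch each bit of a particle marks whether a feature is kept, so the swarm explores subsets directly. A real study would replace the toy data with the actual data set and validate the chosen subset with separate classifiers, as the abstract describes.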

Bibliographic details

Source
International Conference on Intelligent Systems and Control, 2016, pp. 1-6 (6 pages)
Authors
Doreswamy; M Umme Salma;
Original format: PDF
Keywords
Clustering algorithms; Breast cancer; Prediction algorithms; Data mining; Particle swarm optimization;

Similar literature

1. An Efficient Fast Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data [J]. N. Magendiran, J. Jayaranjani. International Journal of Innovative Research in Science, Engineering and Technology, 2014, No. 1.
2. A Fast Clustering-Based Feature Subset Selection Algorithm for High-Dimensional Data [J]. Song Qinbao, Ni Jingjie, Wang Guangtao. IEEE Transactions on Knowledge and Data Engineering, 2013, No. 1.
3. PSO based fast K-means algorithm for feature selection from high dimensional medical data set [C]. Doreswamy, M Umme Salma. International Conference on Intelligent Systems and Control, 2016.
4. Improvement of Relieff Based Feature Selection Algorithms for GWAS Data [D]. Arabnejad Khanouki, Marziyeh. 2020.
5. Feature Selection for High-Dimensional and Imbalanced Biomedical Data Based on Robust Correlation Based Redundancy and Binary Grasshopper Optimization Algorithm [O]. Garba Abdulrauf Sharifai, Zurinahni Zainol. 2020.
6. A hybrid algorithm based on binary chemical reaction optimization and tabu search for feature selection of high-dimensional biomedical data [O]. Chaokun Yan, Jingjing Ma, Huimin Luo, 2018.
7. Rough Set Feature Selection Algorithms for Textual Case-Based Classification [R]. Gupta, K. M., Aha, D. W., Moore, P. 2006.
