An Increased Performance of Clustering High Dimensional Data Using Principal Component Analysis

Abstract

In many application domains, such as information retrieval, computational biology, and image processing, the data dimension is usually very high. Developing effective clustering methods for high-dimensional datasets is a challenging problem because of the curse of dimensionality. The k-means clustering algorithm is used in many practical applications, but it is computationally expensive, and the quality of the resulting clusters depends heavily on the selection of the initial centroids and on the dimension of the data. When the dimension of the dataset is high, the accuracy of the result may fall short of expectations, because the chosen dataset cannot be assumed to be free of noise and flaws. It is therefore necessary to reduce the dimensionality of the given dataset in order to improve efficiency and accuracy. This paper proposes a new approach that improves the accuracy of the clustering results by using PCA both to determine the initial centroids and to reduce the dimension of the data.
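The abstract does not say how PCA is used to pick the initial centroids, so the following Python sketch fills that step with an assumed heuristic: points are ordered by their score on the first principal component, the ordering is split into k contiguous groups, and each group mean serves as a starting centroid. The function names (`pca_project`, `pca_initial_centroids`, `kmeans`), the synthetic data, and the choice of two retained components are illustrative assumptions, not the authors' published procedure.

```python
import numpy as np


def pca_project(X, n_components):
    """Project X onto its top principal components (SVD of the centered data)."""
    X_centered = X - X.mean(axis=0)
    # Rows of Vt are the principal directions, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(X_centered, full_matrices=False)
    return X_centered @ Vt[:n_components].T


def pca_initial_centroids(Z, k):
    """Assumed heuristic (not taken from the paper): sort points by their
    first-component score, split the ordering into k contiguous groups,
    and return the mean of each group."""
    order = np.argsort(Z[:, 0])
    groups = np.array_split(order, k)
    return np.vstack([Z[g].mean(axis=0) for g in groups])


def kmeans(Z, centroids, n_iter=100):
    """Plain Lloyd iterations starting from the supplied centroids."""
    centroids = np.asarray(centroids, dtype=float).copy()
    for _ in range(n_iter):
        # Squared distance from every point to every centroid.
        d2 = ((Z[:, None, :] - centroids[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centroids = centroids.copy()
        for j in range(len(centroids)):
            members = Z[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                new_centroids[j] = members.mean(axis=0)
        if np.allclose(new_centroids, centroids):
            break
        centroids = new_centroids
    return labels, centroids


# Hypothetical synthetic data: three blobs in 50 dimensions.
rng = np.random.default_rng(0)
true_centers = rng.normal(scale=10.0, size=(3, 50))
X = np.vstack([c + rng.normal(size=(100, 50)) for c in true_centers])

Z = pca_project(X, n_components=2)  # reduce 50 dimensions to 2
labels, centroids = kmeans(Z, pca_initial_centroids(Z, k=3))
```

Because the starting centroids are computed from the data rather than drawn at random, repeated runs on the same input give the same clustering, which is one practical motivation for a PCA-based initialization.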

Bibliographic details

Source
First International Conference on Integrated Intelligent Computing | 2010 | pp. 17-21 | 5 pages
Authors
Tajunisha N.; Saravanan V.
Original format: PDF
Chinese Library Classification: automation technology, computer technology
Keywords
dimension reduction; k-means; principal component analysis
