A hybridized method for clustering datasets using principal components, selection and rejection methods


Abstract

A novel clustering method based on the k-means algorithm, developed to address the complexity of clustering big data, has been shown to be fast, scalable, and highly accurate. It achieves this by computing only over those attributes of the dataset that are of interest to the analyst. In this study, selection and rejection methods are applied after Principal Component Analysis (PCA) on the dataset to identify the relevant features and their order of significance for clustering. This hybridization allows the order of relevant features to be identified from the principal components of the dataset before clustering with the novel method. The method was used to cluster the Iris dataset and a dataset of Conus shell samples. Results show that the clustering precision of the hybridized method was comparable to that of the existing novel algorithm and remained higher than that of the standard k-means clustering algorithm.
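
The feature-ordering idea in the abstract can be illustrated with a minimal sketch. The abstract does not state the paper's actual selection and rejection criteria, so the code below assumes a simple stand-in: features are ranked by the magnitude of their loading on the first principal component, and only the top-ranked ones are kept. The function names (`rank_features`, `select_features`) and the `keep` parameter are hypothetical, and the power-iteration routine is one generic way to obtain the first principal component, not necessarily what the authors used.

```python
import math


def covariance_matrix(rows):
    """Sample covariance matrix of the columns of `rows` (list of equal-length lists)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [
        [sum(c[i] * c[j] for c in centered) / (n - 1) for j in range(d)]
        for i in range(d)
    ]


def first_principal_component(cov, iterations=500):
    """Approximate the dominant eigenvector of `cov` by power iteration.

    Starts from a uniform vector, which works unless that vector happens
    to be orthogonal to the dominant eigenvector.
    """
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iterations):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        if norm == 0.0:  # zero covariance matrix: nothing to rank
            break
        v = [x / norm for x in w]
    return v


def rank_features(rows):
    """Feature indices ordered by |loading| on the first principal component.

    Assumption: loading magnitude stands in for the paper's unspecified
    selection and rejection criteria. Features are not standardized here,
    so high-variance features dominate the ranking.
    """
    pc1 = first_principal_component(covariance_matrix(rows))
    return sorted(range(len(pc1)), key=lambda j: abs(pc1[j]), reverse=True)


def select_features(rows, keep):
    """Keep only the `keep` highest-ranked features of every row."""
    order = rank_features(rows)[:keep]
    return [[r[j] for j in order] for r in rows]


# Toy data: feature 0 varies strongly, feature 1 slightly, feature 2 not at all.
rows = [
    [1.0, 2.0, 5.0],
    [2.0, 2.1, 5.0],
    [3.0, 1.9, 5.0],
    [4.0, 2.0, 5.0],
    [10.0, 2.5, 5.0],
]
ranking = rank_features(rows)
reduced = select_features(rows, keep=2)
```

In a pipeline like the one the abstract describes, the reduced rows would then be passed to the clustering step, so that distances are computed only over the retained attributes.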

Bibliographic details

Source
Innovation and Analytics Conference & Exhibition | 2019 | various paging | 6 pages
Authors
Jozelle Addawe; Lee Javellana
Original format: PDF
Chinese Library Classification: Thermal metrology

Similar literature

1. On the Nystrom and Column-Sampling Methods for the Approximate Principal Components Analysis of Large Datasets [J]. Homrighausen Darren, Mcdonald Daniel J. Journal of computational and graphical statistics: A joint publication of American Statistical Association, Institute of Mathematical Statistics, Interface Foundation of North America. 2016, no. 2
2. Evaluation of principal component selection methods to form a global prediction model by principal component regression [J]. Xie YL, Kalivas JH. Analytica chimica acta. 1997, no. 1a3
3. Clustering of High Dimensional Dataset Using K-MAM (Max-Avg-Min) Method with Principal Component Analysis: A Hybrid Approach [J]. S. Dhanabal, Dr. S. Chandramathi. Journal of Theoretical and Applied Information Technology. 2014, no. 1
4. A hybridized method for clustering datasets using principal components, selection and rejection methods [C]. Jozelle Addawe, Lee Javellana. Innovation and Analytics Conference & Exhibition. 2019
5. Non-redundant clustering, principal feature selection and learning methods applied to lung tumor image-guided radiotherapy [D]. Cui, Ying. 2009
6. Application of feature selection methods for automated clustering analysis: a review on synthetic datasets [O]. Aliyu Usman Ahmad, Andrew Starkey
7. Application of feature selection methods for automated clustering analysis: a review on synthetic datasets [O]. Ahmad, Aliyu Usman, Starkey, Andrew. 2017
