Conference: Innovation and Analytics Conference & Exhibition

A hybridized method for clustering datasets using principal components, selection and rejection methods

Abstract

A novel clustering method based on the k-means algorithm, designed to address the complexity of clustering big data, has been shown to be fast, scalable, and highly accurate. It achieves this by computing only over those attributes of a dataset that are of interest to the analyst. In this study, selection and rejection methods are applied after Principal Component Analysis (PCA) of the dataset to identify the relevant features and their order of significance for clustering. This hybridization allows the order of relevant features to be identified from the principal components of the dataset before clustering with the novel method. The method was used to cluster the Iris dataset and a dataset of Conus shell samples. Results show that the clustering precision of the hybridized method was comparable to that of the existing novel algorithm and remained higher than that of the k-means clustering algorithm.
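
The pipeline described in the abstract (PCA, then feature selection and rejection, then clustering) can be illustrated with a minimal sketch. The abstract does not specify the selection and rejection rules or the novel clustering method, so this sketch makes two assumptions: features are ranked by their absolute loading on the first principal component, and plain k-means is used as a stand-in for the clustering step. All function names and the toy data are hypothetical and are not taken from the paper.

```python
import math

def standardize(rows):
    """Z-score each column (assumes no column is constant)."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    stds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / n) for j in range(d)]
    return [[(r[j] - means[j]) / stds[j] for j in range(d)] for r in rows]

def first_component(z, iters=200):
    """Leading eigenvector of the covariance matrix, found by power iteration."""
    n, d = len(z), len(z[0])
    cov = [[sum(r[i] * r[j] for r in z) / n for j in range(d)] for i in range(d)]
    v = [1.0] * d  # assumed start vector; it must not be orthogonal to the answer
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

def rank_features(rows):
    """Feature indices ordered by absolute loading on the first component.

    Assumption: this ranking stands in for the paper's unspecified
    selection and rejection methods.
    """
    loadings = first_component(standardize(rows))
    return sorted(range(len(loadings)), key=lambda j: -abs(loadings[j]))

def kmeans(points, k, iters=50):
    """Plain Lloyd's k-means with deterministic farthest-point initialisation."""
    def dist2(p, q):
        return sum((a - b) ** 2 for a, b in zip(p, q))
    centroids = [points[0]]
    while len(centroids) < k:
        centroids.append(max(points, key=lambda p: min(dist2(p, c) for c in centroids)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: dist2(p, centroids[c])) for p in points]
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

def cluster_on_selected(rows, n_keep, k):
    """Keep the n_keep top-ranked features, reject the rest, then cluster."""
    selected = sorted(rank_features(rows)[:n_keep])
    z = standardize(rows)
    reduced = [[r[j] for j in selected] for r in z]
    return selected, kmeans(reduced, k)

# Toy data invented for illustration: columns 0 and 2 separate two groups,
# columns 1 and 3 carry only small, unrelated variation.
TOY = [
    [1.1, 5.1, 9.8, 0.4],
    [0.9, 4.9, 10.2, 0.4],
    [0.9, 5.1, 9.8, 0.2],
    [1.1, 4.9, 10.2, 0.2],
    [5.1, 5.1, 20.2, 0.4],
    [4.9, 4.9, 19.8, 0.4],
    [4.9, 5.1, 20.2, 0.2],
    [5.1, 4.9, 19.8, 0.2],
]

selected, labels = cluster_on_selected(TOY, n_keep=2, k=2)
print("selected features:", selected)
print("cluster labels:", labels)
```

On the toy data, the two group-separating columns are kept and the other two are rejected before clustering. Ranking on a single component is the simplest possible choice; a fuller treatment might weight loadings across several components by their explained variance.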
