A novel clustering method based on the k-means algorithm, designed to address the complexity of clustering big data, has been shown to be fast, scalable, and highly accurate. The method achieves this by computing only over those attributes of the dataset that are of interest to the analyst. In this study, selection and rejection methods are applied after Principal Component Analysis (PCA) of the dataset to identify the relevant features and their order of significance for clustering. This hybridization allows the order of relevant features to be identified from the principal components of the dataset before clustering with the novel method. The method was implemented to cluster the Iris dataset and a dataset of Conus shell samples. Results show that the clustering precision of the hybridized method was comparable to that of the existing novel algorithm and remained higher than that of the k-means clustering algorithm.
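The PCA-then-cluster pipeline described above can be illustrated with a minimal sketch. The abstract does not state the selection and rejection rules, so this example assumes one simple possibility: features are ranked by the absolute value of their loading on the first principal component, and only the top-ranked ones are kept. Plain Lloyd's k-means stands in for the novel method, whose details are not given here. All function names (`rank_features`, `kmeans`, and so on) and the toy data are hypothetical, chosen only for illustration.

```python
import math
import random


def covariance_matrix(rows):
    """Sample covariance matrix of a list of equal-length numeric rows."""
    n, d = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(d)]
    centered = [[r[j] - means[j] for j in range(d)] for r in rows]
    return [[sum(r[i] * r[j] for r in centered) / (n - 1) for j in range(d)]
            for i in range(d)]


def first_component(cov, iters=200):
    """Approximate the leading eigenvector of cov by power iteration."""
    d = len(cov)
    v = [1.0 / math.sqrt(d)] * d
    for _ in range(iters):
        w = [sum(cov[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v


def rank_features(rows):
    """Assumed rule: order feature indices by |loading| on the first component."""
    loadings = first_component(covariance_matrix(rows))
    return sorted(range(len(loadings)), key=lambda j: abs(loadings[j]), reverse=True)


def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)),
                key=lambda c: sum((p - q) ** 2 for p, q in zip(pt, centroids[c])))
            for pt in points]


def kmeans(points, k, iters=50, seed=0):
    """Plain Lloyd's k-means; a stand-in, not the novel method from the abstract."""
    centroids = [list(p) for p in random.Random(seed).sample(points, k)]
    for _ in range(iters):
        labels = assign(points, centroids)
        for c in range(k):
            members = [pt for pt, lab in zip(points, labels) if lab == c]
            if members:  # keep the old centroid if a cluster empties
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return assign(points, centroids)


# Toy data (hypothetical): features 0 and 2 separate two groups, feature 1 is noise.
data = [
    [1.0, 0.5, 1.2],
    [1.2, 0.4, 0.9],
    [0.9, 0.6, 1.1],
    [5.0, 0.5, 6.1],
    [5.2, 0.6, 5.8],
    [4.9, 0.4, 6.0],
]

ranking = rank_features(data)            # feature indices, most significant first
selected = ranking[:2]                   # keep the two top-ranked features
reduced = [[row[j] for j in selected] for row in data]
labels = kmeans(reduced, k=2)            # cluster only on the selected features
print("feature ranking:", ranking)
print("cluster labels:", labels)
```

In this sketch the noise feature is ranked last and dropped, so the distance computations in the clustering step run over fewer attributes. A fuller treatment might weight loadings across several components by their explained variance; that choice is also an assumption and not something the abstract specifies.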