Conference: International Conference on Future Information Technology; International Conference on Multimedia and Ubiquitous Engineering

A Novel on Altered K-Means Algorithm for Clustering Cost Decrease of Non-labeling Big-Data

Abstract

Machine learning on Big Data is attracting attention as a way to retrieve useful knowledge inherent in multi-dimensional information and to discover new knowledge in fields concerned with storing and retrieving the massive volumes of multi-dimensional information now being produced. Machine learning techniques can be divided into supervised and unsupervised learning according to whether the data are labeled. Unsupervised learning, which classifies and analyzes unlabeled data, is used in various ways in the analysis of multi-dimensional Big Data. This study analyzes the problems of the conventional K-means algorithm and proposes an altered K-means algorithm that determines the number of clusters automatically. It also proposes optimizing the number of clusters by applying principal component analysis (PCA) as a pre-processing step to the input data for clustering. The performance evaluation confirms that the proposed algorithm outperforms the conventional K-means algorithm in accuracy as measured by the CVI.
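The abstract does not say how the number of clusters is chosen or which CVI is used, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes that CVI denotes a cluster validity index and substitutes the mean silhouette coefficient for it. It reduces the data with PCA (computed here by power iteration), runs plain K-means for each candidate k, and keeps the k with the best score. All function names, the farthest-point initialisation, and the synthetic data are hypothetical choices made for illustration.

```python
import math
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def pca_project(points, n_components, iters=200):
    """Project points onto their top principal components (power iteration)."""
    n, d = len(points), len(points[0])
    mean = [sum(p[j] for p in points) / n for j in range(d)]
    centered = [[p[j] - mean[j] for j in range(d)] for p in points]
    # Sample covariance matrix (d x d).
    cov = [[sum(r[a] * r[b] for r in centered) / (n - 1) for b in range(d)]
           for a in range(d)]
    components = []
    for _ in range(n_components):
        start = [1.0 / (j + 1) for j in range(d)]  # arbitrary non-zero start
        scale = math.sqrt(sum(x * x for x in start))
        v = [x / scale for x in start]
        for _ in range(iters):
            w = [sum(cov[a][b] * v[b] for b in range(d)) for a in range(d)]
            norm = math.sqrt(sum(x * x for x in w))
            if norm == 0.0:
                break
            v = [x / norm for x in w]
        # Rayleigh quotient estimates the eigenvalue that belongs to v.
        eigval = sum(v[a] * cov[a][b] * v[b] for a in range(d) for b in range(d))
        components.append(v)
        # Deflate so the next pass finds the next-largest component.
        cov = [[cov[a][b] - eigval * v[a] * v[b] for b in range(d)]
               for a in range(d)]
    return [[sum(r[j] * c[j] for j in range(d)) for c in components]
            for r in centered]


def kmeans(points, k, iters=50):
    """Lloyd's K-means with deterministic farthest-point initialisation."""
    centers = [points[0]]
    while len(centers) < k:
        centers.append(max(points,
                           key=lambda p: min(sq_dist(p, c) for c in centers)))
    labels = [0] * len(points)
    for _ in range(iters):
        labels = [min(range(k), key=lambda c: sq_dist(p, centers[c]))
                  for p in points]
        new_centers = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            new_centers.append([sum(col) / len(members) for col in zip(*members)]
                               if members else centers[c])
        if new_centers == centers:  # converged
            break
        centers = new_centers
    return labels, centers


def silhouette(points, labels):
    """Mean silhouette coefficient, used here as the stand-in validity index."""
    k = max(labels) + 1
    total = 0.0
    for i, p in enumerate(points):
        sums, counts = [0.0] * k, [0] * k
        for j, q in enumerate(points):
            if i != j:
                sums[labels[j]] += math.sqrt(sq_dist(p, q))
                counts[labels[j]] += 1
        own = labels[i]
        if counts[own] == 0:
            continue  # singleton cluster contributes a score of 0
        a = sums[own] / counts[own]
        b = min((sums[c] / counts[c] for c in range(k)
                 if c != own and counts[c]), default=a)
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


def choose_k(points, k_min=2, k_max=6):
    """Return the k in [k_min, k_max] with the highest silhouette, plus all scores."""
    scores = {k: silhouette(points, kmeans(points, k)[0])
              for k in range(k_min, k_max + 1)}
    return max(scores, key=scores.get), scores


def make_blobs(seed=0, per_cluster=30):
    """Synthetic 4-D data: three Gaussian blobs (illustrative only)."""
    rng = random.Random(seed)
    blob_centers = [(0, 0, 0, 0), (8, 8, 0, 0), (0, 8, 8, 0)]
    return [[c + rng.gauss(0, 0.5) for c in centre]
            for centre in blob_centers for _ in range(per_cluster)]


if __name__ == "__main__":
    reduced = pca_project(make_blobs(), n_components=2)
    best_k, scores = choose_k(reduced)
    print("selected k:", best_k)
```

Scanning a range of k and scoring each result is the simplest way to pick the cluster count without labels; the method in the paper may differ. The pairwise silhouette computation is quadratic in the number of points, so on genuinely large data it would need sampling or a cheaper index.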
