Expediting K-means cluster analysis data mining using subsample elimination preprocessing

Abstract

Improved efficiencies of data mining clustering techniques are provided by preprocessing a sample set of data points taken from a complete data set to provide seeds for centroid calculations of the complete data set. Such seeds are generated by selecting a uniform sample set of data points from a set of multi-dimensional data and then seed values for the cluster determination calculation are determined using a centroid analysis on the sample set of data points. The number of seeds calculated corresponds to a number of data clusters expected in the set of multi-dimensional data points. Seed values are determined using subsample elimination techniques.
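
The flow described in the abstract can be sketched roughly as follows: draw a uniform sample, run a centroid analysis on that sample to obtain k seeds, then start the clustering of the complete data set from those seeds. This is a minimal illustration under stated assumptions, not the patented method. The abstract does not detail the subsample elimination step, so it is not implemented here; plain Lloyd iterations stand in for the "centroid analysis", and all function names and parameters (`lloyd`, `seeds_from_sample`, `seeded_kmeans`, `sample_size`) are hypothetical.

```python
import random


def sq_dist(a, b):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def mean(points):
    """Coordinate-wise mean of a non-empty list of points."""
    n = len(points)
    return tuple(sum(coords) / n for coords in zip(*points))


def lloyd(points, centroids, max_iter=100):
    """Plain Lloyd iterations starting from the given centroids."""
    centroids = [tuple(c) for c in centroids]
    for _ in range(max_iter):
        groups = [[] for _ in centroids]
        for p in points:
            nearest = min(range(len(centroids)),
                          key=lambda i: sq_dist(p, centroids[i]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        updated = [mean(g) if g else c for g, c in zip(groups, centroids)]
        if updated == centroids:
            break
        centroids = updated
    return centroids


def seeds_from_sample(points, k, sample_size, rng):
    """Assumed preprocessing: cluster a uniform sample to obtain k seeds.

    The subsample elimination step named in the abstract is NOT modeled.
    """
    sample = rng.sample(points, min(sample_size, len(points)))
    start = rng.sample(sample, k)  # requires k <= len(sample)
    return lloyd(sample, start)


def seeded_kmeans(points, k, sample_size, seed=0):
    """Cluster the complete data set, starting from the sample-derived seeds."""
    rng = random.Random(seed)
    seeds = seeds_from_sample(points, k, sample_size, rng)
    return lloyd(points, seeds)
```

Because the seeds come from a small sample, the expensive passes over the complete data set start close to the final centroids and typically need fewer iterations than a random start would.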

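Bibliographic details

Publication number: US8229876B2
Patent type:
Publication date: 2012-07-24
Original format: PDF
Applicant/patentee: SHOUNAK ROYCHOWDHURY
Application number: US20090552011
Inventor: SHOUNAK ROYCHOWDHURY
Filing date: 2009-09-01
Classification: G06N5/02
Country: US
Database entry time: 2022-08-21 17:29:02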
