Method and apparatus for reducing the computational requirements of K-means data clustering

Abstract

The present invention is directed to an improved data clustering method and apparatus for use in data mining operations. The present invention determines the pattern vectors of a k-d tree structure that are closest to a given prototype cluster by pruning prototypes through geometrical constraints before a k-means process is applied to the prototypes. For each sub-branch in the k-d tree, a candidate set of prototypes is formed from the parent of a child node. The minimum and maximum distances from any point in the child node to each prototype in the candidate set are determined. The smallest of these maximum distances is then compared with the minimum distance of each prototype in the candidate set. Prototypes whose minimum distance is greater than the smallest of the maximum distances are pruned (eliminated). Pruning remote prototypes reduces the number of distance calculations for the k-means process, significantly reducing the overall computation time.
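
The pruning test described in the abstract can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: it assumes each k-d tree node is summarized by an axis-aligned bounding box (`box_lo`, `box_hi`), that distances are Euclidean, and that NumPy is available. The function names `min_max_dist_to_box` and `prune_candidates` are hypothetical.

```python
import numpy as np

def min_max_dist_to_box(prototypes, box_lo, box_hi):
    """For each prototype, return the smallest and largest Euclidean distance
    to any point of the axis-aligned box [box_lo, box_hi] (assumed node shape)."""
    # Nearest point of the box: clamp the prototype's coordinates into the box.
    nearest = np.clip(prototypes, box_lo, box_hi)
    min_d = np.linalg.norm(prototypes - nearest, axis=1)
    # Farthest point is a corner: on each axis, pick the more distant face.
    gap = np.maximum(np.abs(prototypes - box_lo), np.abs(prototypes - box_hi))
    max_d = np.linalg.norm(gap, axis=1)
    return min_d, max_d

def prune_candidates(prototypes, candidate_idx, box_lo, box_hi):
    """Keep only candidates whose minimum distance to the node does not
    exceed the smallest maximum distance among all candidates."""
    candidate_idx = np.asarray(candidate_idx)
    min_d, max_d = min_max_dist_to_box(prototypes[candidate_idx], box_lo, box_hi)
    threshold = max_d.min()          # smallest of the maximum distances
    return candidate_idx[min_d <= threshold]

# Example: a unit-square node and three candidate prototypes inherited
# from the parent node; the far-away prototype (index 1) is pruned.
prototypes = np.array([[0.5, 0.5], [10.0, 10.0], [1.5, 0.5]])
box_lo = np.array([0.0, 0.0])
box_hi = np.array([1.0, 1.0])
kept = prune_candidates(prototypes, [0, 1, 2], box_lo, box_hi)
```

A prototype that survives the test may still lose to another survivor for a given point, so the k-means distance calculations are only reduced, not eliminated. Each child node would pass its surviving candidates on to its own children.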

Bibliographic data

Publication number: US5983224A

Patent type:
Publication date: 1999-11-09

Original format: PDF
Applicant/Assignee: HITACHI AMERICA LTD.

Application number: US19970962470
Inventors: VINEET SINGH; SANJAY RANKA; KHALED ALSABTI

Filing date: 1997-10-31
Classification: G06F17/30
Country: US
Database entry time: 2022-08-22 02:06:47
