Method, apparatus and programmed medium for clustering databases with categorical attributes

Abstract

The present invention relates to a computer method, apparatus and programmed medium for clustering databases containing data with categorical attributes. The present invention treats a pair of points as neighbors if their similarity exceeds a certain threshold. The similarity value for a pair of points can be based on non-metric information. The present invention determines the total number of links between each cluster and every other cluster based upon the neighbors of the points in those clusters. A goodness measure is then calculated for each pair of clusters from the total number of links between the two clusters and the total number of points within each of them. The present invention merges the two clusters with the best goodness measure. Thus, clustering is performed accurately and efficiently by merging data based on the number of links between the data to be clustered.
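
The steps in the abstract can be illustrated with a minimal Python sketch. The abstract does not specify the similarity function, how a link is counted, or the formula for the goodness measure, so everything specific here is an assumption made for illustration: Jaccard similarity over sets of categorical values, a link count equal to the number of neighbors two points share (a point is not counted as its own neighbor), and a goodness measure that divides the cross-cluster link count by a size-dependent term resembling the normalization used in the published ROCK link-based clustering algorithm. The names jaccard, cluster_by_links, theta and k are hypothetical, and the sketch favors clarity over efficiency.

```python
from itertools import combinations


def jaccard(a, b):
    """Assumed similarity: overlap of two sets of categorical values."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def cluster_by_links(points, theta, k):
    """Greedily merge clusters until k remain or no links are left.

    Assumes 0 < theta < 1 so the normalization term stays positive.
    """
    n = len(points)
    # Two points are neighbors if their similarity exceeds the threshold.
    neighbors = [
        {j for j in range(n) if j != i and jaccard(points[i], points[j]) > theta}
        for i in range(n)
    ]
    # Assumption: the link count of a pair is the number of shared neighbors.
    link = {
        (i, j): len(neighbors[i] & neighbors[j])
        for i, j in combinations(range(n), 2)
    }
    # Assumed exponent for the size-based normalization (not from the abstract).
    exponent = 1 + 2 * (1 - theta) / (1 + theta)

    def cross_links(c1, c2):
        # Total number of links between points of two disjoint clusters.
        return sum(link[min(p, q), max(p, q)] for p in c1 for q in c2)

    def goodness(c1, c2):
        n1, n2 = len(c1), len(c2)
        expected = (n1 + n2) ** exponent - n1 ** exponent - n2 ** exponent
        return cross_links(c1, c2) / expected

    clusters = [frozenset([i]) for i in range(n)]
    while len(clusters) > k:
        # Pick the pair of clusters with the best goodness measure.
        best = max(combinations(clusters, 2), key=lambda pair: goodness(*pair))
        if cross_links(*best) == 0:
            break  # no remaining pair of clusters shares any link
        clusters = [c for c in clusters if c not in best] + [best[0] | best[1]]
    return clusters
```

Called as cluster_by_links(points, theta, k), where points is a list of sets of categorical values, the function returns a list of frozensets of point indices. Because merging is driven by shared neighbors rather than by pairwise distance, records that have no neighbors in common are never merged in this sketch, even when k is smaller than the number of natural groups.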
