User-controlled iterative sub-clustering of large data sets guided by statistical heuristics

Abstract

The present invention relates to data analysis, and in particular to methods for cluster analysis. It provides a method that summarizes and illustrates an original data set by iteratively breaking it into sub-divisions that together form a hierarchical cluster structure. The method comprises at least the steps of collecting a parametrically predetermined number of samples from a given original data set, in which each data item is described by a vector of values, and iterating each of the following steps at least once: presenting to the user the hierarchical cluster structure composed by the already completed iterations, together with the list of variables specified by the data set, presented in a manner that indicates a heuristic for optimal distinctivity within the cluster; receiving from the user a selection of a supercluster to be sub-divided and of a sub-divisive variable; collecting a sample of a fixed number of items from the original data set such that the items fall within the union of interval values for each of the variables that defined the supercluster in previous iterations; and performing a sub-division of said cluster on said selected divisive variable.
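The loop described in the abstract can be illustrated with a minimal Python sketch. It is not the patented implementation. The abstract names neither the distinctivity heuristic nor the splitting rule, so the sketch assumes population variance as the ranking heuristic and an equal-width split into `k` intervals. All names (`Cluster`, `draw_sample`, `rank_variables`, `subdivide`) are hypothetical, and the user's selections are scripted rather than read interactively.

```python
import random
import statistics
from dataclasses import dataclass, field


@dataclass
class Cluster:
    """A node of the hierarchy; bounds maps variable index -> (low, high)."""
    bounds: dict = field(default_factory=dict)
    children: list = field(default_factory=list)

    def contains(self, item):
        # An item belongs to the cluster if it lies inside every interval
        # that defined the cluster in earlier iterations.
        return all(lo <= item[v] <= hi for v, (lo, hi) in self.bounds.items())


def draw_sample(data, cluster, n, rng):
    """Sample at most n items of the original data that fall in the cluster."""
    members = [x for x in data if cluster.contains(x)]
    return rng.sample(members, min(n, len(members)))


def rank_variables(sample, n_vars):
    """Assumed heuristic: variables with larger variance are listed first."""
    scores = {v: statistics.pvariance([x[v] for x in sample]) for v in range(n_vars)}
    return sorted(scores, key=scores.get, reverse=True)


def subdivide(cluster, sample, var, k=2):
    """Assumed rule: split the sampled range of `var` into k equal-width intervals."""
    values = [x[var] for x in sample]
    lo, hi = min(values), max(values)
    edges = [lo + (hi - lo) * i / k for i in range(k)] + [hi]
    for low, high in zip(edges, edges[1:]):
        # Each child inherits the parent's intervals and adds its own.
        cluster.children.append(Cluster({**cluster.bounds, var: (low, high)}))
    return cluster.children


# Synthetic data: each item is a vector of two values.
rng = random.Random(0)
data = [(rng.uniform(0, 10), rng.uniform(0, 1)) for _ in range(500)]
root = Cluster()

# Iteration 1: sample, rank the variables, split the root on the top-ranked one.
sample = draw_sample(data, root, 100, rng)
ranking = rank_variables(sample, 2)
children = subdivide(root, sample, ranking[0])

# Iteration 2: the "user" picks the first child and variable 1 (scripted here).
chosen = children[0]
sample = draw_sample(data, chosen, 100, rng)
grandchildren = subdivide(chosen, sample, 1)
```

Because every iteration re-samples from the original data set rather than from the previous sample, the per-iteration cost is bounded by the fixed sample size `n`, however deep the hierarchy grows. In this sketch the child intervals are derived from the sample's minimum and maximum, so items outside the sampled range are not covered by any child. Adjacent intervals also share their boundary value.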

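Bibliographic data

  • Publication number: US2018365279A1

  • Publication date: 2018-12-20

  • Original format: PDF

  • Applicant/assignee: PERSPICAMUS AB

  • Application number: US201816010574

  • Inventor: MAURI KAIPAINEN

  • Filing date: 2018-06-18

  • Classification: G06F17/30

  • Country: US

  • Date indexed: 2022-08-21 12:09:48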
