Statistical Inference for Cluster Trees

Abstract

A cluster tree provides a highly interpretable summary of a density function by representing the hierarchy of its high-density clusters. It is estimated using the empirical tree, which is the cluster tree constructed from a density estimator. This paper addresses the basic question of quantifying our uncertainty by assessing the statistical significance of topological features of an empirical cluster tree. We first study a variety of metrics that can be used to compare different trees, analyze their properties and assess their suitability for inference. We then propose methods to construct and summarize confidence sets for the unknown true cluster tree. We introduce a partial ordering on cluster trees which we use to prune some of the statistically insignificant features of the empirical tree, yielding interpretable and parsimonious cluster trees. Finally, we illustrate the proposed methods on a variety of synthetic examples and furthermore demonstrate their utility in the analysis of a Graft-versus-Host Disease (GvHD) data set.
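
To make the idea of an empirical cluster tree concrete, the following is a minimal one-dimensional sketch, not the authors' implementation. It assumes a Gaussian kernel density estimator with a fixed bandwidth, evaluated on a grid, and it reads off the tree by counting connected components of the upper level sets {x : density(x) >= level}. The bootstrap of the sup-norm distance is one plausible way to quantify uncertainty and is an assumption here, since the abstract does not specify the construction. All function names (`gaussian_kde_1d`, `count_clusters`, `sup_norm_band_width`) and parameter values are hypothetical.

```python
import numpy as np


def gaussian_kde_1d(samples, grid, bandwidth):
    """Evaluate a Gaussian kernel density estimate on the points in `grid`."""
    diffs = (grid[:, None] - samples[None, :]) / bandwidth
    kernel = np.exp(-0.5 * diffs**2) / np.sqrt(2.0 * np.pi)
    return kernel.mean(axis=1) / bandwidth


def count_clusters(density, level):
    """Count connected components of {x : density(x) >= level} on a 1-D grid."""
    above = density >= level
    # A component starts wherever `above` switches from False to True.
    previous = np.concatenate(([False], above[:-1]))
    return int((above & ~previous).sum())


def sup_norm_band_width(samples, grid, bandwidth, alpha=0.1, n_boot=200, seed=0):
    """Bootstrap the (1 - alpha) quantile of the sup-norm distance between
    the density estimate and its bootstrap replicates (assumed construction)."""
    rng = np.random.default_rng(seed)
    base = gaussian_kde_1d(samples, grid, bandwidth)
    sup_dists = np.empty(n_boot)
    for b in range(n_boot):
        resample = rng.choice(samples, size=samples.size, replace=True)
        replicate = gaussian_kde_1d(resample, grid, bandwidth)
        sup_dists[b] = np.max(np.abs(replicate - base))
    return float(np.quantile(sup_dists, 1.0 - alpha))


# Hypothetical synthetic data: a mixture of two well-separated Gaussians.
rng = np.random.default_rng(1)
samples = np.concatenate([rng.normal(-3.0, 0.5, 200), rng.normal(3.0, 0.5, 200)])
grid = np.linspace(-6.0, 6.0, 400)
density = gaussian_kde_1d(samples, grid, bandwidth=0.4)

# Sweeping the level from low to high traces the tree: branches split as
# the level rises past a valley and vanish above the corresponding peak.
for level in (0.0, 0.1, 0.2):
    print(level, count_clusters(density, level))

width = sup_norm_band_width(samples, grid, bandwidth=0.4)
print("bootstrap sup-norm quantile:", width)
```

Under this assumed construction, a branch of the empirical tree whose height (peak level minus the level at which it splits off) is small relative to twice the bootstrap quantile would be a natural candidate for pruning, which is one simple reading of removing statistically insignificant features.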