Journal: Journal of Classification

Note: t for Two (Clusters)


Abstract

The computation for cluster analysis is done by iterative algorithms. But here, a straightforward, non-iterative procedure is presented for clustering in the special case of one variable and two groups. The method is univariate but may reasonably be applied to multivariate datasets when the first principal component or a single factor explains much of the variation in the data. The t method is motivated by the fact that minimizing the within-groups sum of squares is equivalent to maximizing the between-groups sum of squares, and that Student's t statistic measures the between-groups difference in means relative to within-groups variation. That is, the t statistic is the difference in sample means divided by the standard error of this difference. So, maximizing the t statistic is developed as a method for clustering univariate data into two clusters. In this situation, the t method gives the same results as the K-means algorithm. K-means tacitly assumes equality of variances; here, however, with t, equality of variances need not be assumed because separate variances may be used in computing t. The t method is applied to some datasets; the results are compared with those obtained by fitting mixtures of distributions.
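The idea in the abstract can be illustrated with a minimal Python sketch. It is not the author's published procedure: the abstract gives no algorithmic details, so the exhaustive scan over cut points of the sorted data, the requirement of at least two observations per cluster, and the function name `best_split_by_t` are all assumptions made here for illustration. With `pooled=False` the standard error uses separate variances; with `pooled=True` it uses the pooled variance, which corresponds to the equal-variance assumption the abstract attributes to K-means.

```python
import math


def best_split_by_t(values, pooled=False):
    """Scan every cut of the sorted data and return (t, lower, upper)
    for the cut with the largest two-sample t statistic.

    Hypothetical helper for illustration; assumes each cluster must
    hold at least two observations so its variance is defined.
    """
    xs = sorted(values)
    n = len(xs)
    if n < 4:
        raise ValueError("need at least 4 observations (2 per cluster)")

    best = None
    # k is the size of the lower cluster; both clusters keep >= 2 points.
    for k in range(2, n - 1):
        lower, upper = xs[:k], xs[k:]
        m1 = sum(lower) / k
        m2 = sum(upper) / (n - k)
        ss1 = sum((x - m1) ** 2 for x in lower)  # within-cluster sums of squares
        ss2 = sum((x - m2) ** 2 for x in upper)

        if pooled:
            # Pooled variance: equal-variance form of the t statistic.
            s2 = (ss1 + ss2) / (n - 2)
            se = math.sqrt(s2 * (1 / k + 1 / (n - k)))
        else:
            # Separate variances: no equal-variance assumption.
            se = math.sqrt(ss1 / (k - 1) / k + ss2 / (n - k - 1) / (n - k))

        if se == 0:
            t = math.inf if m2 > m1 else 0.0
        else:
            t = (m2 - m1) / se

        if best is None or t > best[0]:
            best = (t, lower, upper)
    return best


data = [1.0, 1.2, 0.9, 1.1, 5.0, 5.3, 4.8, 5.1]
t, lower, upper = best_split_by_t(data)
print(lower, upper)
```

For these made-up values the scan places the four small observations in one cluster and the four large ones in the other. The scan evaluates n - 3 candidate cuts once the data are sorted, so no iteration to convergence is needed.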
