Journal: Chemometrics and Intelligent Laboratory Systems

DiPCA_Cluster: An optimal alternative to DiPLS_Cluster for unsupervised classification

Abstract

Unsupervised cluster analysis is frequently used to explore the structure of large datasets. This paper introduces a new method, DiPCA_Cluster, an optimal alternative to DiPLS_Cluster with two additional refinements. DiPLS_Cluster is not optimal: it does not converge to a unique solution (the result depends on how the algorithm is initialized), and the dendrograms it provides are not very meaningful. DiPCA_Cluster, by contrast, is optimal with regard to the chosen criterion, inertia. At each step of the analysis, the data are split into two groups according to the values of a 0/1 vector. Unlike DiPLS_Cluster, our method bases this split on a threshold value, which is more in keeping with the general problem; to this end, two proposals are suggested. Finally, whereas DiPLS_Cluster builds the dendrogram from the number of iterations, DiPCA_Cluster uses the total variance explained by each principal component within each group, which is better justified and yields dendrograms very close to those obtained with the Ward criterion.
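
The splitting step described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the two threshold proposals nor the exact dendrogram rule, so the sketch assumes a threshold of zero on the first principal component scores, chooses the next group to split by the variance of its first component, and records that variance as a stand-in split height. The names `first_pc` and `divisive_pca_sketch`, and all of their parameters, are hypothetical.

```python
import numpy as np

def first_pc(X):
    """Scores on the first principal component of X and the variance it explains."""
    Xc = X - X.mean(axis=0)                      # center the group
    U, s, _ = np.linalg.svd(Xc, full_matrices=False)
    scores = U[:, 0] * s[0]                      # projections onto the first component
    variance = s[0] ** 2 / max(len(X) - 1, 1)    # sample variance along that component
    return scores, variance

def divisive_pca_sketch(X, n_clusters, threshold=0.0):
    """Repeatedly split one group in two until n_clusters groups exist.

    Assumptions (not taken from the paper): the group to split is the one
    whose first component has the largest variance, and the 0/1 indicator
    is obtained by comparing the scores with a fixed threshold.
    """
    X = np.asarray(X, dtype=float)
    groups = [np.arange(len(X))]                 # start with all samples in one group
    heights = []                                 # variance recorded at each split
    while len(groups) < n_clusters:
        candidates = [(first_pc(X[g])[1], i)
                      for i, g in enumerate(groups) if len(g) > 1]
        if not candidates:
            break                                # nothing left that can be split
        variance, i = max(candidates)
        scores, _ = first_pc(X[groups[i]])
        indicator = scores > threshold           # boolean form of the 0/1 vector
        if indicator.all() or not indicator.any():
            break                                # degenerate split, stop
        g = groups.pop(i)
        groups += [g[~indicator], g[indicator]]
        heights.append(variance)
    labels = np.empty(len(X), dtype=int)
    for k, g in enumerate(groups):
        labels[g] = k
    return labels, heights
```

Calling `divisive_pca_sketch(X, n_clusters=2)` on two well-separated groups of points should place each group in its own cluster. Reproducing the published dendrograms would require the threshold and height rules from the paper itself.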
