Proceedings of the 21st International Conference on Pattern Recognition

Iterative Neighbor-Joining tree clustering algorithm for genotypic data

Abstract

Questions to explore in genotypic datasets include the number and characteristic patterns of subpopulations and, possibly, the relationships among them. Model-based clustering methods have been adopted to estimate the number of clusters and the individual assignments. However, they cannot infer genetic relationships among subpopulations the way phylogenetic trees, such as the widely used Neighbor-Joining (NJ) tree, can. In this paper we propose an unsupervised, iterative clustering framework called iNJclust. It performs clustering on an NJ tree with a graph-based partitioning technique. The iterative process enhances the zooming ability and corrects the topology of the final NJ trees. Inference on genetic similarities between subpopulations is also possible. As final outputs, the iNJclust algorithm provides an estimate of the number of clusters, individual assignments, a population tree, and sub-trees of the terminal nodes. We illustrate the superior clustering performance of the proposed algorithm on datasets of 27 human populations, 47 bovine breeds, and 28 sheep breeds.
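The abstract does not specify the graph-based partitioning technique or the stopping rule, so the following is only a rough sketch of the general idea, not the iNJclust implementation. It assumes a precomputed pairwise distance matrix between individuals, builds a standard NJ tree, cuts the longest branch that leaves at least `min_size` individuals on each side, and then rebuilds an NJ tree inside each part. The longest-branch cut, the thresholds `min_branch` and `min_size`, and all function names are hypothetical stand-ins chosen for illustration.

```python
from itertools import combinations


def neighbor_joining(dist, labels):
    """Standard NJ on a symmetric distance matrix.

    Returns the unrooted tree as a list of (node, node, branch_length) edges.
    Leaves keep their labels; internal nodes are ("internal", k) tuples.
    """
    d = {a: {b: dist[i][j] for j, b in enumerate(labels)}
         for i, a in enumerate(labels)}
    nodes, edges, next_id = list(labels), [], 0
    while len(nodes) > 2:
        n = len(nodes)
        r = {a: sum(d[a][b] for b in nodes if b != a) for a in nodes}
        # Join the pair minimising the NJ criterion Q(a, b).
        a, b = min(combinations(nodes, 2),
                   key=lambda p: (n - 2) * d[p[0]][p[1]] - r[p[0]] - r[p[1]])
        u = ("internal", next_id)
        next_id += 1
        len_a = 0.5 * d[a][b] + (r[a] - r[b]) / (2 * (n - 2))
        edges += [(a, u, len_a), (b, u, d[a][b] - len_a)]
        d[u] = {}
        for c in nodes:
            if c not in (a, b):
                d[u][c] = d[c][u] = 0.5 * (d[a][c] + d[b][c] - d[a][b])
        nodes = [c for c in nodes if c not in (a, b)] + [u]
    a, b = nodes
    edges.append((a, b, d[a][b]))
    return edges


def best_split(edges, leaves, min_branch, min_size):
    """Cut the longest branch (hypothetical rule) that is at least min_branch
    long and leaves min_size or more individuals on each side."""
    for u, v, length in sorted(edges, key=lambda e: e[2], reverse=True):
        if length < min_branch:
            return None
        adj = {}
        for a, b, _ in edges:
            if (a, b) != (u, v):  # drop the candidate branch
                adj.setdefault(a, []).append(b)
                adj.setdefault(b, []).append(a)
        seen, stack = {u}, [u]  # nodes still reachable from u
        while stack:
            for y in adj.get(stack.pop(), []):
                if y not in seen:
                    seen.add(y)
                    stack.append(y)
        side = [x for x in leaves if x in seen]
        rest = [x for x in leaves if x not in seen]
        if min(len(side), len(rest)) >= min_size:
            return side, rest
    return None


def iterative_nj_clusters(dist, labels, min_branch, min_size=2):
    """Repeatedly rebuild an NJ tree on each group and split it until no
    admissible cut remains. Assumes min_size >= 1."""
    index = {x: i for i, x in enumerate(labels)}
    pending, clusters = [list(labels)], []
    while pending:
        group = pending.pop()
        split = None
        if len(group) >= 2 * min_size:
            sub = [[dist[index[a]][index[b]] for b in group] for a in group]
            split = best_split(neighbor_joining(sub, group), group,
                               min_branch, min_size)
        if split is None:
            clusters.append(sorted(group))
        else:
            pending.extend(split)
    return sorted(clusters)


# Toy data: two tight groups (distance 2 within, 10 between).
labels = ["A", "B", "C", "D", "E", "F"]
dist = [[0 if i == j else (2 if (i < 3) == (j < 3) else 10)
         for j in range(6)] for i in range(6)]
clusters = iterative_nj_clusters(dist, labels, min_branch=3.0)
print(len(clusters), clusters)
```

Rebuilding the tree inside each part, rather than reusing sub-trees of the first tree, is one way to read the "zooming" described above: distances to individuals outside the group no longer influence the local topology.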
