
Clumpak: a program for identifying clustering modes and packaging population structure inferences across K


Abstract

The identification of the genetic structure of populations from multilocus genotype data has become a central component of modern population-genetic data analysis. Application of model-based clustering programs often entails a number of steps, in which the user considers different modeling assumptions, compares results across different pre-determined values of the number of assumed clusters (a parameter typically denoted K), examines multiple independent runs for each fixed value of K, and distinguishes among runs belonging to substantially distinct clustering solutions. Here, we present Clumpak (Cluster Markov Packager Across K), a method that automates the post-processing of results of model-based population structure analyses. For analyzing multiple independent runs at a single K value, Clumpak identifies sets of highly similar runs, separating distinct groups of runs that represent distinct modes in the space of possible solutions. This procedure, which generates a consensus solution for each distinct mode, is performed by the use of a Markov clustering algorithm that relies on a similarity matrix between replicate runs, as computed by the software Clumpp. Next, Clumpak identifies an optimal alignment of inferred clusters across different values of K, extending a similar approach implemented for a fixed K in Clumpp, and simplifying the comparison of clustering results across different K values. Clumpak incorporates additional features, such as implementations of methods for choosing K and comparing solutions obtained by different programs, models, or data subsets. Clumpak, available at , simplifies the use of model-based analyses of population structure in population genetics and molecular ecology.
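The mode-identification step described above can be illustrated with a minimal sketch of Markov clustering applied to a run-by-run similarity matrix. This is not Clumpak's code: the function names, the fixed similarity threshold, the inflation value, and the toy similarity scores are all assumptions made for illustration, and in practice the similarity matrix would come from Clumpp rather than being typed in by hand.

```python
def normalize_columns(m):
    """Scale each column to sum to 1 (column-stochastic matrix)."""
    n = len(m)
    out = [row[:] for row in m]
    for j in range(n):
        total = sum(m[i][j] for i in range(n))
        for i in range(n):
            out[i][j] = m[i][j] / total if total > 0 else 0.0
    return out


def matmul(a, b):
    """Plain square-matrix product, adequate for a handful of runs."""
    n = len(a)
    return [[sum(a[i][k] * b[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]


def connected_components(m, eps=1e-6):
    """Group indices linked by entries larger than eps in either direction."""
    n = len(m)
    seen = [False] * n
    groups = []
    for start in range(n):
        if seen[start]:
            continue
        seen[start] = True
        stack, group = [start], []
        while stack:
            i = stack.pop()
            group.append(i)
            for j in range(n):
                if not seen[j] and (m[i][j] > eps or m[j][i] > eps):
                    seen[j] = True
                    stack.append(j)
        groups.append(sorted(group))
    return groups


def markov_cluster(similarity, threshold=0.9, inflation=2.0,
                   max_iter=100, tol=1e-9):
    """Cluster runs into modes with a basic Markov clustering loop.

    `threshold` and `inflation` are illustrative defaults (assumptions),
    not values taken from Clumpak.
    """
    n = len(similarity)
    # Graph of runs: keep edges at or above the threshold, add self-loops.
    m = [[1.0 if i == j
          else (similarity[i][j] if similarity[i][j] >= threshold else 0.0)
          for j in range(n)] for i in range(n)]
    m = normalize_columns(m)
    for _ in range(max_iter):
        expanded = matmul(m, m)                       # expansion step
        inflated = normalize_columns(                 # inflation step
            [[v ** inflation for v in row] for row in expanded])
        delta = max(abs(inflated[i][j] - m[i][j])
                    for i in range(n) for j in range(n))
        m = inflated
        if delta < tol:
            break
    # Runs that still share flow after convergence form one mode.
    return connected_components(m)


# Hypothetical similarity scores for five replicate runs at one K.
similarity = [
    [1.00, 0.98, 0.98, 0.40, 0.40],
    [0.98, 1.00, 0.98, 0.40, 0.40],
    [0.98, 0.98, 1.00, 0.40, 0.40],
    [0.40, 0.40, 0.40, 1.00, 0.97],
    [0.40, 0.40, 0.40, 0.97, 1.00],
]
print(markov_cluster(similarity))  # [[0, 1, 2], [3, 4]]
```

In this toy input, runs 0 to 2 form one mode and runs 3 and 4 form another. A consensus solution per mode, as the abstract describes, would then be computed from the runs within each group.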
