Journal: BMC Bioinformatics

PCA-based population structure inference with generic clustering algorithms

Abstract

Background: Handling genotype data typed at hundreds of thousands of loci is very time-consuming, and population structure inference is no exception. We therefore propose applying PCA to the genotype data of a population, selecting the significant principal components using the Tracy-Widom distribution, and assigning individuals to one or more subpopulations using generic clustering algorithms.

Results: We investigated K-means, soft K-means, and spectral clustering, and compared them with STRUCTURE, a model-based algorithm designed specifically for population structure inference. We also investigated methods for predicting the number of subpopulations in a population. Results on four simulated datasets and two real datasets indicate that our approach performs comparably to STRUCTURE. For the simulated datasets, STRUCTURE and soft K-means with BIC produced identical predictions of the number of subpopulations. We also showed that, for real data, BIC is a better index than likelihood for predicting the number of subpopulations.

Conclusion: Our approach is fast and scalable, whereas STRUCTURE is very time-consuming because it relies on MCMC for parameter estimation. We therefore suggest choosing the algorithm according to the application of population structure inference.
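The pipeline summarized in the abstract (PCA on genotypes, then a generic clustering algorithm on the retained components) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation. The simulated data, the function names (`simulate_genotypes`, `top_pcs`, `kmeans`), the farthest-point initialization, and the fixed number of retained components are all assumptions made for illustration. In particular, the Tracy-Widom significance test is replaced here by a hand-picked `n_components`, genotypes are only mean-centered, and soft K-means, spectral clustering, and BIC-based selection of the number of subpopulations are omitted.

```python
import numpy as np


def simulate_genotypes(n_per_pop=30, n_loci=1000, seed=0):
    """Toy data (hypothetical): two subpopulations with different allele frequencies."""
    rng = np.random.default_rng(seed)
    freq_a = rng.uniform(0.1, 0.9, n_loci)
    # Second population: perturbed frequencies, kept away from 0 and 1.
    freq_b = np.clip(freq_a + rng.normal(0.0, 0.3, n_loci), 0.05, 0.95)
    # Genotypes are counts of one allele (0, 1 or 2) at each locus.
    geno_a = rng.binomial(2, freq_a, size=(n_per_pop, n_loci))
    geno_b = rng.binomial(2, freq_b, size=(n_per_pop, n_loci))
    labels = np.repeat([0, 1], n_per_pop)
    return np.vstack([geno_a, geno_b]).astype(float), labels


def top_pcs(genotypes, n_components):
    """Project individuals onto the leading principal components via SVD."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    # Rows of u * s are the PC scores of each individual.
    return u[:, :n_components] * s[:n_components]


def kmeans(points, k, n_iter=100):
    """Plain Lloyd's K-means with a deterministic farthest-point start (assumption)."""
    # First center: the point farthest from the overall mean.
    sq_to_mean = ((points - points.mean(axis=0)) ** 2).sum(axis=1)
    centers = [points[np.argmax(sq_to_mean)]]
    # Each further center: the point farthest from all centers chosen so far.
    for _ in range(k - 1):
        sq = np.min([((points - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(sq)])
    centers = np.array(centers)

    for _ in range(n_iter):
        # Assign every individual to its nearest center.
        sq = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        assign = sq.argmin(axis=1)
        # Recompute centers; an empty cluster keeps its previous center.
        new_centers = np.array([
            points[assign == j].mean(axis=0) if np.any(assign == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return assign, centers


if __name__ == "__main__":
    genotypes, truth = simulate_genotypes()
    # One component is kept here by assumption, since two groups were simulated.
    scores = top_pcs(genotypes, n_components=1)
    predicted, _ = kmeans(scores, k=2)
    # Cluster labels are arbitrary, so compare up to a swap of the two labels.
    agreement = max(np.mean(predicted == truth), np.mean(predicted != truth))
    print(f"agreement with simulated labels: {agreement:.2f}")
```

Clustering runs on a handful of principal-component scores rather than on all loci, which is where the speed advantage described in the abstract comes from. On real data the number of retained components and the number of clusters would need to be chosen by a principled criterion such as those the abstract names.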