Journal: Theoretical and Applied Genetics: International Journal of Breeding Research and Cell Genetics

Maximizing genetic differentiation in core collections by PCA-based clustering of molecular marker data.


Abstract

Developing genetically diverse core sets is key to the effective management and use of crop genetic resources. Core selection increasingly uses molecular marker-based dissimilarity and clustering methods, under the implicit assumption that markers and genes of interest are genetically correlated. In practice, low marker densities mean that genome-wide correlations are mainly caused by genetic differentiation, rather than by physical linkage. Although of central concern, genetic differentiation per se is not specifically targeted by most commonly employed dissimilarity and clustering methods. Principal component analysis (PCA) on genotypic data is known to effectively describe the inter-locus correlations caused by differentiation, but to date there has been no evaluation of its application to core selection. Here, we explore PCA-based clustering of marker data as a basis for core selection, with the aim of demonstrating its use in capturing genetic differentiation in the data. Using simulated datasets, we show that replacing full-rank genotypic data by the subset of genetically significant PCs leads to better description of differentiation and improves assignment of genotypes to their population of origin. We test the effectiveness of differentiation as a criterion for the formation of core sets by applying a simple new PCA-based core selection method to simulated and actual data and comparing its performance to one of the best existing selection algorithms. We find that although gains in genetic diversity are generally modest, PCA-based core selection is equally effective at maximizing diversity at non-marker loci, while providing better representation of genetically differentiated groups.
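
The general idea in the abstract (replace the full marker matrix with a few leading PCs, cluster in that reduced space, then draw the core from each cluster) can be illustrated with a small sketch. This is not the authors' algorithm, which the abstract does not specify. Everything here is an assumption made for illustration: the simulated data, the choice of K - 1 PCs for K populations, the simple k-means clustering with farthest-first initialisation, the equal allocation per cluster, and all function names.

```python
import numpy as np


def simulate_genotypes(n_pops=3, n_per_pop=40, n_markers=200, seed=0):
    """Simulate 0/1/2 genotypes for populations with diverged allele frequencies."""
    rng = np.random.default_rng(seed)
    ancestral = rng.uniform(0.2, 0.8, size=n_markers)
    blocks = []
    for _ in range(n_pops):
        # Each population drifts away from the ancestral frequencies (assumed model).
        freqs = np.clip(ancestral + rng.normal(0.0, 0.25, size=n_markers), 0.02, 0.98)
        blocks.append(rng.binomial(2, freqs, size=(n_per_pop, n_markers)))
    labels = np.repeat(np.arange(n_pops), n_per_pop)
    return np.vstack(blocks).astype(float), labels


def pc_scores(genotypes, n_pcs):
    """Return scores on the first n_pcs principal components (PCA via SVD)."""
    centered = genotypes - genotypes.mean(axis=0)
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    return u[:, :n_pcs] * s[:n_pcs]


def cluster(points, k, n_iter=25):
    """Plain k-means with deterministic farthest-first initialisation."""
    first = np.argmax(np.linalg.norm(points - points.mean(axis=0), axis=1))
    centers = [points[first]]
    while len(centers) < k:
        # Next centre: the point farthest from its nearest existing centre.
        d = np.min([np.linalg.norm(points - c, axis=1) for c in centers], axis=0)
        centers.append(points[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        assign = dists.argmin(axis=1)
        for j in range(k):
            if np.any(assign == j):
                centers[j] = points[assign == j].mean(axis=0)
    return assign, centers


def select_core(points, assign, centers, per_cluster):
    """Pick the per_cluster accessions nearest each cluster centre (assumed rule)."""
    core = []
    for j in range(len(centers)):
        members = np.flatnonzero(assign == j)
        d = np.linalg.norm(points[members] - centers[j], axis=1)
        core.extend(members[np.argsort(d)[:per_cluster]].tolist())
    return sorted(core)


genotypes, labels = simulate_genotypes()
scores = pc_scores(genotypes, n_pcs=2)  # K - 1 PCs for K = 3 populations (assumption)
assign, centers = cluster(scores, k=3)
core = select_core(scores, assign, centers, per_cluster=5)
print("core accessions:", core)
print("populations represented:", sorted(set(labels[core].tolist())))
```

Clustering on the PC scores rather than the full genotype matrix discards most within-population sampling noise, which is the effect the abstract attributes to using only the genetically significant PCs. In real data the number of PCs to keep and the number of clusters would need to be estimated rather than fixed in advance.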
