Journal: G3: Genes, Genomes, Genetics

CONE: Community Oriented Network Estimation Is a Versatile Framework for Inferring Population Structure in Large-Scale Sequencing Data

Abstract

Estimation of genetic population structure based on molecular markers is a common task in population genetics and ecology. We apply a generalized linear model with LASSO regularization to infer relationships between individuals and populations from molecular marker data. Specifically, we apply a neighborhood selection algorithm to infer population genetic structure and gene flow between populations. The resulting relationships are used to construct an individual-level population graph. Network substructures, known as communities, are then separated from one another using a community detection algorithm. Inference of population structure using networks combines the strengths of: (i) network theory (a broad collection of tools, including aesthetically pleasing visualization), (ii) principal component analysis (dimension reduction together with simple visual inspection), and (iii) model-based methods (e.g., ancestry coefficient estimates). We have named our process CONE (for community oriented network estimation). CONE has fewer restrictions than conventional assignment methods: properties such as the number of subpopulations need not be fixed before the analysis, and the sample may include close relatives or involve uneven sampling. Applying CONE to simulated data sets resulted in more accurate estimates of the true number of subpopulations than model-based methods, and provided comparable ancestry coefficient estimates. Analyses of empirical data sets (teosinte single nucleotide polymorphisms, a bacterial disease outbreak, and the Human Genome Diversity Panel) illustrate that population structures estimated with CONE are consistent with earlier findings.
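
The pipeline described in the abstract (LASSO-based neighborhood selection, an individual-level graph, then grouping of the graph into communities) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation. Several choices here are assumptions made for illustration only: the toy simulator `simulate`, the fixed penalty `lam`, the "OR" rule for combining the per-individual regressions into edges, a plain Gaussian LASSO rather than a general GLM, and the use of connected components as a simplified stand-in for a real community detection algorithm.

```python
import random

def simulate(n_pops=2, n_per_pop=3, n_markers=600, seed=0):
    """Toy data (hypothetical): each population has its own allele frequencies."""
    rng = random.Random(seed)
    genotypes = []
    for _ in range(n_pops):
        freqs = [0.05 if rng.random() < 0.5 else 0.95 for _ in range(n_markers)]
        for _ in range(n_per_pop):
            # Genotype = number of copies of the allele (0, 1 or 2) per marker.
            genotypes.append([(rng.random() < f) + (rng.random() < f) for f in freqs])
    return genotypes

def standardize(values):
    """Center and scale one individual's marker vector (mean 0, variance 1)."""
    n = len(values)
    mean = sum(values) / n
    sd = (sum((v - mean) ** 2 for v in values) / n) ** 0.5 or 1.0
    return [(v - mean) / sd for v in values]

def soft_threshold(z, lam):
    if z > lam:
        return z - lam
    if z < -lam:
        return z + lam
    return 0.0

def lasso_cd(columns, y, lam, n_iter=50):
    """Coordinate descent for (1/2n)||y - Xb||^2 + lam*||b||_1.

    Assumes every column is standardized, so each update is a soft threshold.
    """
    n = len(y)
    beta = [0.0] * len(columns)
    resid = list(y)
    for _ in range(n_iter):
        for k, col in enumerate(columns):
            rho = sum(c * r for c, r in zip(col, resid)) / n + beta[k]
            new = soft_threshold(rho, lam)
            if new != beta[k]:
                delta = new - beta[k]
                resid = [r - delta * c for r, c in zip(resid, col)]
                beta[k] = new
    return beta

def neighborhood_graph(genotypes, lam):
    """Regress each individual on all others; nonzero coefficients become edges."""
    cols = [standardize(g) for g in genotypes]
    edges = set()
    for j in range(len(cols)):
        others = [k for k in range(len(cols)) if k != j]
        beta = lasso_cd([cols[k] for k in others], cols[j], lam)
        for k, b in zip(others, beta):
            if b != 0.0:  # "OR" rule (assumption): one selection is enough
                edges.add((min(j, k), max(j, k)))
    return edges

def connected_components(n_nodes, edges):
    """Simplified stand-in for community detection: graph components."""
    adj = {i: set() for i in range(n_nodes)}
    for a, b in edges:
        adj[a].add(b)
        adj[b].add(a)
    seen, comps = set(), []
    for start in range(n_nodes):
        if start in seen:
            continue
        seen.add(start)
        stack, comp = [start], []
        while stack:
            node = stack.pop()
            comp.append(node)
            for nb in adj[node] - seen:
                seen.add(nb)
                stack.append(nb)
        comps.append(sorted(comp))
    return comps

genotypes = simulate()
edges = neighborhood_graph(genotypes, lam=0.25)
communities = connected_components(len(genotypes), edges)
print(communities)
```

Note that the number of groups is not passed in anywhere: it falls out of the graph structure, which mirrors the abstract's point that the number of subpopulations need not be fixed in advance. On real data, where gene flow creates edges between populations, connected components would merge groups, and a proper community detection method would be needed instead.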
