
Genome-wide prediction of breeding values and mapping of quantitative trait loci in stratified and admixed populations.



Abstract

Ideally, genome-wide association studies require homogeneous samples drawn from randomly mating populations with minimal pedigree relationships. In reality, such samples are very hard to collect. Non-random mating combined with artificial selection has created complex patterns of population structure and relatedness in commercial crop and livestock populations. Proper modeling of population structure and kinship is therefore a necessary step in any genome-wide association study. Otherwise, the risk of both false positives (declaring a marker significant when it is not linked to a QTL) and false negatives (declaring a marker linked to a QTL non-significant) increases dramatically.

In this thesis, we first applied the genomic selection (GS) approach to develop equations for predicting breeding values of purebred candidates from a model trained on an admixed or crossbred population. In this approach, all marker effects are treated as random and fitted simultaneously. We hypothesized that, given high-density marker data and the GS approach, training in a crossbred or admixed population could be as accurate as training in the purebred population that is the target of selection. A stochastic simulation study showed that both crossbred and admixed populations could predict breeding values of a purebred population without explicitly modeling breed composition or pedigree relationships. However, the accuracy of GS was greatly reduced when genes from the target pure breed were not represented in the admixed or crossbred training population. The accuracy of GS also depended on the genetic distance between the training and validation populations: the closer the relationship between the two, the higher the prediction accuracy. Further, increasing marker density improved prediction accuracy, especially when a crossbred population was used as the training dataset.
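
As a rough illustration of fitting all marker effects simultaneously as random effects, the following Python sketch trains a ridge-regression model on a simulated admixed population and predicts breeding values in a purebred one. Ridge regression is used here only as a simple stand-in; it is not the model used in the thesis. Everything in the sketch is an assumption made for illustration: the population sizes, the allele frequencies, the shrinkage parameter `lam`, the use of unlinked markers that are themselves causal, and helper names such as `ridge_marker_effects`.

```python
import random

def simulate_genotypes(n, freqs, rng):
    """Draw n individuals; each genotype is an allele count (0, 1 or 2)."""
    return [[(rng.random() < p) + (rng.random() < p) for p in freqs]
            for _ in range(n)]

def breeding_values(X, effects):
    """Sum of allele counts weighted by marker effects, per individual."""
    return [sum(x * a for x, a in zip(row, effects)) for row in X]

def correlation(a, b):
    n = len(a)
    ma, mb = sum(a) / n, sum(b) / n
    cov = sum((x - ma) * (y - mb) for x, y in zip(a, b))
    va = sum((x - ma) ** 2 for x in a)
    vb = sum((y - mb) ** 2 for y in b)
    return cov / (va * vb) ** 0.5

def ridge_marker_effects(X, y, lam, sweeps=30):
    """Estimate all marker effects jointly by Gauss-Seidel sweeps over the
    ridge equations (centered genotypes and phenotypes)."""
    n, p = len(X), len(X[0])
    means = [sum(row[j] for row in X) / n for j in range(p)]
    cols = [[X[i][j] - means[j] for i in range(n)] for j in range(p)]
    y_mean = sum(y) / n
    resid = [v - y_mean for v in y]
    beta = [0.0] * p
    for _ in range(sweeps):
        for j, xj in enumerate(cols):
            xx = sum(v * v for v in xj)
            # Add marker j's current fit back, then re-estimate with shrinkage
            rhs = sum(v * r for v, r in zip(xj, resid)) + xx * beta[j]
            new = rhs / (xx + lam)
            delta = new - beta[j]
            resid = [r - v * delta for r, v in zip(resid, xj)]
            beta[j] = new
    return beta

rng = random.Random(1)
n_markers = 50
freq_a = [rng.uniform(0.2, 0.8) for _ in range(n_markers)]  # hypothetical breed A
freq_b = [rng.uniform(0.2, 0.8) for _ in range(n_markers)]  # hypothetical breed B
true_effects = [rng.gauss(0, 1) for _ in range(n_markers)]

# Training set: admixed individuals with a random share w of breed-A ancestry
train_X = []
for _ in range(300):
    w = rng.random()
    mixed = [w * fa + (1 - w) * fb for fa, fb in zip(freq_a, freq_b)]
    train_X.append(simulate_genotypes(1, mixed, rng)[0])
train_y = [bv + rng.gauss(0, 4.0)
           for bv in breeding_values(train_X, true_effects)]

# Validation set: purebred breed A; breed composition is never modeled
valid_X = simulate_genotypes(100, freq_a, rng)
beta = ridge_marker_effects(train_X, train_y, lam=16.0)
accuracy = correlation(breeding_values(valid_X, beta),
                       breeding_values(valid_X, true_effects))
print(f"accuracy in purebred validation set: {accuracy:.2f}")
```

Accuracy is measured here as the correlation between predicted and simulated true breeding values. Because the toy markers are independent and causal, the sketch says nothing about the effects of LD or marker density discussed in the abstract.
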
Considering haplotypes with weak linkage disequilibrium (LD), the crossbreds showed extensive LD, whereas LD in the purebreds was confined to smaller segments. In contrast, haplotypes with strong LD were much shorter in crossbreds than in purebreds. Our results showed that crossbred populations have fewer haplotypes with strong LD than purebred populations. These findings suggest that crossbred populations are more suitable than purebreds for QTL fine mapping.

In another simulation study, we compared the power, false-positive rate, accuracy, and positive predictive value of QTL mapping in an admixed population with and without modeling breed composition. The performance of ordinary least squares (OLS) and mixed model (MLM) methods, both fitting one marker at a time, was compared with that of a Bayesian multiple-regression (BMR) method that fitted all markers simultaneously. The OLS method showed the highest false-positive rate because it ignored breed composition and pedigree relationships. The MLM approach produced false positives when breed composition was not accounted for. The BMR outperformed both OLS and MLM. It mitigated the confounding effects of breed composition and relatedness without compromising power. In contrast to the MLM, where fitting breed composition reduced both power and the false-positive rate, fitting breed composition in the BMR reduced power without changing the false-positive rate. It was concluded that the BMR is able to self-correct for the effects of population structure and relatedness.
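
The confounding that makes single-marker OLS prone to false positives can be illustrated with a small simulation. The sketch below is not the thesis's simulation and does not implement the MLM or BMR methods; it only tests markers that have no true effect, one at a time, with and without correcting for breed. The two-breed design, the sample sizes, the allele-frequency ranges, the 1.96 threshold, and the helper names `t_statistic` and `center_within` are all assumptions chosen for illustration.

```python
import random

def t_statistic(g, y, extra_df=0):
    """t-statistic for the slope of y regressed on g, with an intercept.
    extra_df counts parameters already removed from both variables."""
    n = len(g)
    mg, my = sum(g) / n, sum(y) / n
    sxx = sum((a - mg) ** 2 for a in g)
    sxy = sum((a - mg) * (b - my) for a, b in zip(g, y))
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0:  # monomorphic marker: nothing to test
        return 0.0
    slope = sxy / sxx
    rss = syy - slope * sxy
    se = (rss / (n - 2 - extra_df) / sxx) ** 0.5
    return slope / se

def center_within(values, groups):
    """Subtract each group's mean, which is equivalent to fitting the
    group (here: breed) as a fixed effect."""
    totals, counts = {}, {}
    for v, k in zip(values, groups):
        totals[k] = totals.get(k, 0.0) + v
        counts[k] = counts.get(k, 0) + 1
    return [v - totals[k] / counts[k] for v, k in zip(values, groups)]

rng = random.Random(7)
n_per_breed, n_markers = 200, 200
breed = [0] * n_per_breed + [1] * n_per_breed

# The trait differs between breeds, but no marker has a true effect,
# so every significant marker is a false positive.
y = [2.0 * b + rng.gauss(0, 1) for b in breed]
y_adj = center_within(y, breed)

naive_hits = adjusted_hits = 0
for _ in range(n_markers):
    # Allele frequency differs between the two breeds at each marker
    freq = [rng.uniform(0.1, 0.9), rng.uniform(0.1, 0.9)]
    g = [(rng.random() < freq[b]) + (rng.random() < freq[b]) for b in breed]
    if abs(t_statistic(g, y)) > 1.96:
        naive_hits += 1
    g_adj = center_within(g, breed)
    if abs(t_statistic(g_adj, y_adj, extra_df=1)) > 1.96:
        adjusted_hits += 1

print(f"false positives ignoring breed:   {naive_hits} / {n_markers}")
print(f"false positives correcting breed: {adjusted_hits} / {n_markers}")
```

With discrete, known breeds, centering within breed is enough. Continuous breed composition or pedigree relationships would need covariates or a relationship matrix instead, which this sketch does not attempt.
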

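Bibliographic record

  • Author

    Toosi, Ali S.

  • Author affiliation

    Iowa State University

  • Degree-granting institution: Iowa State University
  • Subject: Biology, Genetics
  • Degree: Ph.D.
  • Year: 2012
  • Pages: 224 p.
  • Total pages: 224
  • Original format: PDF
  • Text language: English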
