Improving the Accuracy of Genomic Predictions: Investigation of Training Methods and Data Pooling

Abstract

One of the primary factors in the response to selection is the accuracy of selection. This study focused on methodologies to predict breeding values (BV) accurately within multi-step and single-step genomic evaluations. The effects of cross-validation method, dependent variable, and genotyping strategy on the accuracy of genomic BV were assessed using multi-step prediction in real and simulated data. In both cases, random clustering led to the largest estimated accuracies compared to clusters based on k-means, k-medoids, and principal component analysis, but differences in bias were not detected. Using deregressed estimated BV (EBV) to estimate SNP effects led to larger accuracies and smaller standard errors than using adjusted phenotypes. Randomly genotyping animals instead of selectively genotyping the top 25% was associated with the highest accuracies and the least bias.

Genetic improvement of economically relevant traits (ERT) should be the goal of breeding programs. Although generally absent in seedstock herds, ERT are routinely collected within commercial sectors; therefore, pooling data was proposed to include commercial information in a cost-effective manner. Pooling involves collecting tissue samples from a group of animals and then combining the DNA to be genotyped as one. The accuracy of EBV when pooled data were used within single-step analysis was investigated through simulation. For a single trait, pool sizes of 2, 10, 20, or 50 did not generally lead to differences in EBV accuracy compared to using individual data when pools were constructed to minimize phenotypic variation. Low-accuracy sires benefited the most from pooling, while EBV for the pools could be used for management purposes. For a bivariate analysis, pool sizes of at least 20 were recommended in combination with minimizing phenotypic variation. Additionally, if pools were constructed to minimize phenotypic variation, pooling could be used across a range of genetic correlations (0.1, 0.4, and 0.7) and across the ways in which missing values arise (randomly missing records or sequential culling). Collectively, these results suggest pooling can be used to include commercial data within genetic evaluations.
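
The abstract does not say how pools were built to minimize phenotypic variation. One simple way to get low within-pool variation is to sort animals by phenotype and group consecutive animals, as in the hedged sketch below. The function name `build_pools`, the 0/1/2 genotype coding, the assumption that every animal contributes equal DNA to its pool, and the choice to drop leftover animals are all illustrative assumptions, not details taken from the thesis.

```python
import numpy as np

def build_pools(phenotypes, genotypes, pool_size):
    """Group animals into pools of similar phenotype (illustrative sketch).

    phenotypes: length-n sequence of trait records.
    genotypes:  n x m array of allele counts coded 0/1/2 (assumed coding).
    pool_size:  number of animals per pool.
    """
    phenotypes = np.asarray(phenotypes, dtype=float)
    genotypes = np.asarray(genotypes, dtype=float)

    # Sorting by phenotype and taking consecutive blocks keeps animals
    # with similar records together, which lowers within-pool variation.
    order = np.argsort(phenotypes, kind="stable")
    n_pools = len(order) // pool_size  # leftover animals are dropped here

    pools = []
    for p in range(n_pools):
        members = order[p * pool_size:(p + 1) * pool_size]
        pools.append({
            "members": members,
            # Pool record: mean phenotype of the animals in the pool.
            "phenotype": phenotypes[members].mean(),
            # Pool genotype: allele frequency per SNP, assuming each
            # animal contributes an equal amount of DNA.
            "allele_freq": genotypes[members].mean(axis=0) / 2.0,
        })
    return pools
```

Calling `build_pools(y, M, 20)` on a phenotype vector `y` and genotype matrix `M` would return one dictionary per pool; how those pooled records enter a single-step evaluation is outside the scope of this sketch.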

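Bibliographic details

  • Author

    Baller, Johnna Lynn

  • Author affiliation

    The University of Nebraska - Lincoln

  • Degree-granting institution: The University of Nebraska - Lincoln
  • Subject: Animal sciences; Agriculture
  • Degree:
  • Year: 2020
  • Pages: 172
  • Total pages: 172
  • Original format: PDF
  • Language of text: English
  • Chinese Library Classification:
  • Keywords

    Animal sciences; Agriculture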
