Journal of Animal Breeding and Genetics

The impact of the rank of marker variance-covariance matrix in principal component evaluation for genomic selection applications. (Special Issue: Use of genomic tools)


Abstract

In genomic selection (GS) programmes, direct genomic values (DGV) are evaluated using information provided by high-density SNP chips. Because DGV accuracy depends strictly on SNP density, an increase in the number of markers per chip is likely to have severe computational consequences. The aim of the present work was to test the effectiveness of principal component analysis (PCA) carried out by chromosome in reducing marker dimensionality for GS purposes. A simulated data set of 5700 individuals with an equal number of SNPs distributed over six chromosomes was used. Principal components (PCs) were extracted both genome-wide (ALL) and separately by chromosome (CHR) and used to predict DGVs. In the ALL scenario, the SNP variance-covariance matrix (S) was singular and positive semi-definite, and it contained null information that introduces 'spuriousness' into the derived results. In contrast, the S matrix for each chromosome (CHR scenario) had full rank. DGV accuracies were always better for CHR than for ALL. Moreover, in the ALL scenario, DGV accuracies soon became unstable as the number of animals decreased, whereas in CHR they remained stable down to 900-1000 individuals. In real applications where a 54k SNP chip is used, the largest number of markers per chromosome is approximately 2500. Thus, around 3000 genotyped animals could give reliable results when the original SNP variables are replaced by a reduced number of PCs.
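The rank argument in the abstract can be illustrated with a small sketch. It is not the authors' simulation: the sizes (60 animals, 6 chromosomes, 10 SNPs each), the independent binomial genotypes, and the helper name `covariance_rank` are all assumptions chosen for illustration, and real SNPs in linkage disequilibrium would behave less cleanly. The point it shows is that a covariance matrix estimated from n animals has rank at most n - 1, so the genome-wide S is singular when the number of SNPs is not smaller than the number of animals, while each per-chromosome S can still be full rank.

```python
import numpy as np


def covariance_rank(genotypes):
    """Return (rank, dimension) of the SNP variance-covariance matrix S.

    `genotypes` is an (animals x SNPs) array; columns are SNP variables.
    """
    s = np.cov(genotypes, rowvar=False)
    return int(np.linalg.matrix_rank(s)), s.shape[0]


# Toy sizes (assumed): as many SNPs as animals, split over six chromosomes.
rng = np.random.default_rng(0)
n_animals, n_chrom, snp_per_chrom = 60, 6, 10
n_snp = n_chrom * snp_per_chrom

# Genotypes coded 0/1/2, drawn independently (a simplification).
genotypes = rng.binomial(2, 0.5, size=(n_animals, n_snp)).astype(float)

# ALL scenario: one covariance matrix over every SNP.
rank_all, dim_all = covariance_rank(genotypes)

# CHR scenario: one covariance matrix per chromosome block of columns.
chrom_blocks = np.split(genotypes, n_chrom, axis=1)
ranks_chr = [covariance_rank(block) for block in chrom_blocks]

print(f"ALL: rank {rank_all} of {dim_all}")
for i, (rank, dim) in enumerate(ranks_chr, start=1):
    print(f"CHR {i}: rank {rank} of {dim}")
```

Centring the data removes one degree of freedom, which is why the genome-wide rank cannot reach the number of SNPs here. The same reasoning is consistent with the abstract's closing remark: with at most about 2500 markers per chromosome, roughly 3000 genotyped animals keep each per-chromosome matrix in the regime where full rank is possible.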
